forked from analogdevicesinc/linux
-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ADC pins enabled on PEC_POWER connector #1
Open
FrankBuss
wants to merge
1
commit into
parallella:xcomm_zynq
Choose a base branch
from
FrankBuss:xcomm_zynq
base: xcomm_zynq
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 561a4fe upstream. As trace event triggers are now part of the mainline kernel, I added my trace event trigger tests to my test suite I run on all my kernels. Now these tests get run under different config options, and one of those options is CONFIG_PROVE_RCU, which checks under lockdep that the rcu locking primitives are being used correctly. This triggered the following splat: =============================== [ INFO: suspicious RCU usage. ] 3.15.0-rc2-test+ #11 Not tainted ------------------------------- kernel/trace/trace_events_trigger.c:80 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 4 locks held by swapper/1/0: #0: ((&(&j_cdbs->work)->timer)){..-...}, at: [<ffffffff8104d2cc>] call_timer_fn+0x5/0x1be #1: (&(&pool->lock)->rlock){-.-...}, at: [<ffffffff81059856>] __queue_work+0x140/0x283 #2: (&p->pi_lock){-.-.-.}, at: [<ffffffff8106e961>] try_to_wake_up+0x2e/0x1e8 #3: (&rq->lock){-.-.-.}, at: [<ffffffff8106ead3>] try_to_wake_up+0x1a0/0x1e8 stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.15.0-rc2-test+ #11 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006 0000000000000001 ffff88007e083b98 ffffffff819f53a5 0000000000000006 ffff88007b0942c0 ffff88007e083bc8 ffffffff81081307 ffff88007ad96d20 0000000000000000 ffff88007af2d840 ffff88007b2e701c ffff88007e083c18 Call Trace: <IRQ> [<ffffffff819f53a5>] dump_stack+0x4f/0x7c [<ffffffff81081307>] lockdep_rcu_suspicious+0x107/0x110 [<ffffffff810ee51c>] event_triggers_call+0x99/0x108 [<ffffffff810e8174>] ftrace_event_buffer_commit+0x42/0xa4 [<ffffffff8106aadc>] ftrace_raw_event_sched_wakeup_template+0x71/0x7c [<ffffffff8106bcbf>] ttwu_do_wakeup+0x7f/0xff [<ffffffff8106bd9b>] ttwu_do_activate.constprop.126+0x5c/0x61 [<ffffffff8106eadf>] try_to_wake_up+0x1ac/0x1e8 [<ffffffff8106eb77>] wake_up_process+0x36/0x3b [<ffffffff810575cc>] wake_up_worker+0x24/0x26 [<ffffffff810578bc>] insert_work+0x5c/0x65 [<ffffffff81059982>] __queue_work+0x26c/0x283 [<ffffffff81059999>] ? __queue_work+0x283/0x283 [<ffffffff810599b7>] delayed_work_timer_fn+0x1e/0x20 [<ffffffff8104d3a6>] call_timer_fn+0xdf/0x1be^M [<ffffffff8104d2cc>] ? call_timer_fn+0x5/0x1be [<ffffffff81059999>] ? __queue_work+0x283/0x283 [<ffffffff8104d823>] run_timer_softirq+0x1a4/0x22f^M [<ffffffff8104696d>] __do_softirq+0x17b/0x31b^M [<ffffffff81046d03>] irq_exit+0x42/0x97 [<ffffffff81a08db6>] smp_apic_timer_interrupt+0x37/0x44 [<ffffffff81a07a2f>] apic_timer_interrupt+0x6f/0x80 <EOI> [<ffffffff8100a5d8>] ? default_idle+0x21/0x32 [<ffffffff8100a5d6>] ? default_idle+0x1f/0x32 [<ffffffff8100ac10>] arch_cpu_idle+0xf/0x11 [<ffffffff8107b3a4>] cpu_startup_entry+0x1a3/0x213 [<ffffffff8102a23c>] start_secondary+0x212/0x219 The cause is that the triggers are protected by rcu_read_lock_sched() but the data is dereferenced with rcu_dereference() which expects it to be protected with rcu_read_lock(). The proper reference should be rcu_dereference_sched(). Cc: Tom Zanussi <[email protected]> Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 10164c2 upstream. Fix driver new_id sysfs-attribute removal deadlock by making sure to not hold any locks that the attribute operations grab when removing the attribute. Specifically, usb_serial_deregister holds the table mutex when deregistering the driver, which includes removing the new_id attribute. This can lead to a deadlock as writing to new_id increments the attribute's active count before trying to grab the same mutex in usb_serial_probe. The deadlock can easily be triggered by inserting a sleep in usb_serial_deregister and writing the id of an unbound device to new_id during module unload. As the table mutex (in this case) is used to prevent subdriver unload during probe, it should be sufficient to only hold the lock while manipulating the usb-serial driver list during deregister. A racing probe will then either fail to find a matching subdriver or fail to get the corresponding module reference. Since v3.15-rc1 this also triggers the following lockdep warning: ====================================================== [ INFO: possible circular locking dependency detected ] 3.15.0-rc2 analogdevicesinc#123 Tainted: G W ------------------------------------------------------- modprobe/190 is trying to acquire lock: (s_active#4){++++.+}, at: [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94 but task is already holding lock: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (table_lock){+.+.+.}: [<c0075f84>] __lock_acquire+0x1694/0x1ce4 [<c0076de8>] lock_acquire+0xb4/0x154 [<c03af3cc>] _raw_spin_lock+0x4c/0x5c [<c02bbc24>] usb_store_new_id+0x14c/0x1ac [<bf007eb4>] new_id_store+0x68/0x70 [usbserial] [<c025f568>] drv_attr_store+0x30/0x3c [<c01690e0>] sysfs_kf_write+0x5c/0x60 [<c01682c0>] kernfs_fop_write+0xd4/0x194 [<c010881c>] vfs_write+0xbc/0x198 [<c0108e4c>] SyS_write+0x4c/0xa0 [<c000f880>] ret_fast_syscall+0x0/0x48 -> #0 (s_active#4){++++.+}: [<c03a7a28>] print_circular_bug+0x68/0x2f8 [<c0076218>] __lock_acquire+0x1928/0x1ce4 [<c0076de8>] lock_acquire+0xb4/0x154 [<c0166b70>] __kernfs_remove+0x254/0x310 [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94 [<c0169fb8>] remove_files.isra.1+0x48/0x84 [<c016a2fc>] sysfs_remove_group+0x58/0xac [<c016a414>] sysfs_remove_groups+0x34/0x44 [<c02623b8>] driver_remove_groups+0x1c/0x20 [<c0260e9c>] bus_remove_driver+0x3c/0xe4 [<c026235c>] driver_unregister+0x38/0x58 [<bf007fb4>] usb_serial_bus_deregister+0x84/0x88 [usbserial] [<bf004db4>] usb_serial_deregister+0x6c/0x78 [usbserial] [<bf005330>] usb_serial_deregister_drivers+0x2c/0x4c [usbserial] [<bf016618>] usb_serial_module_exit+0x14/0x1c [sierra] [<c009d6cc>] SyS_delete_module+0x184/0x210 [<c000f880>] ret_fast_syscall+0x0/0x48 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(table_lock); lock(s_active#4); lock(table_lock); lock(s_active#4); *** DEADLOCK *** 1 lock held by modprobe/190: #0: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial] stack backtrace: CPU: 0 PID: 190 Comm: modprobe Tainted: G W 3.15.0-rc2 analogdevicesinc#123 [<c0015e10>] (unwind_backtrace) from [<c0013728>] (show_stack+0x20/0x24) [<c0013728>] (show_stack) from [<c03a9a54>] (dump_stack+0x24/0x28) [<c03a9a54>] (dump_stack) from [<c03a7cac>] (print_circular_bug+0x2ec/0x2f8) [<c03a7cac>] (print_circular_bug) from [<c0076218>] (__lock_acquire+0x1928/0x1ce4) [<c0076218>] (__lock_acquire) from [<c0076de8>] (lock_acquire+0xb4/0x154) [<c0076de8>] (lock_acquire) from [<c0166b70>] (__kernfs_remove+0x254/0x310) [<c0166b70>] (__kernfs_remove) from [<c0167aa0>] (kernfs_remove_by_name_ns+0x4c/0x94) [<c0167aa0>] (kernfs_remove_by_name_ns) from [<c0169fb8>] (remove_files.isra.1+0x48/0x84) [<c0169fb8>] (remove_files.isra.1) from [<c016a2fc>] (sysfs_remove_group+0x58/0xac) [<c016a2fc>] (sysfs_remove_group) from [<c016a414>] (sysfs_remove_groups+0x34/0x44) [<c016a414>] (sysfs_remove_groups) from [<c02623b8>] (driver_remove_groups+0x1c/0x20) [<c02623b8>] (driver_remove_groups) from [<c0260e9c>] (bus_remove_driver+0x3c/0xe4) [<c0260e9c>] (bus_remove_driver) from [<c026235c>] (driver_unregister+0x38/0x58) [<c026235c>] (driver_unregister) from [<bf007fb4>] (usb_serial_bus_deregister+0x84/0x88 [usbserial]) [<bf007fb4>] (usb_serial_bus_deregister [usbserial]) from [<bf004db4>] (usb_serial_deregister+0x6c/0x78 [usbserial]) [<bf004db4>] (usb_serial_deregister [usbserial]) from [<bf005330>] (usb_serial_deregister_drivers+0x2c/0x4c [usbserial]) [<bf005330>] (usb_serial_deregister_drivers [usbserial]) from [<bf016618>] (usb_serial_module_exit+0x14/0x1c [sierra]) [<bf016618>] (usb_serial_module_exit [sierra]) from [<c009d6cc>] (SyS_delete_module+0x184/0x210) [<c009d6cc>] (SyS_delete_module) from [<c000f880>] (ret_fast_syscall+0x0/0x48) Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
…f the receiver's buffer" [ Upstream commit 362d520 ] This reverts commit ef2820a ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer") as it introduced a serious performance regression on SCTP over IPv4 and IPv6, though a not as dramatic on the latter. Measurements are on 10Gbit/s with ixgbe NICs. Current state: [root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.241.3 -V -l 1452 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64 Time: Fri, 11 Apr 2014 17:56:21 GMT Connecting to host 192.168.241.3, port 5201 Cookie: Lab200slot2.1397238981.812898.548918 [ 4] local 192.168.241.2 port 38616 connected to 192.168.241.3 port 5201 Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.09 sec 20.8 MBytes 161 Mbits/sec [ 4] 1.09-2.13 sec 10.8 MBytes 86.8 Mbits/sec [ 4] 2.13-3.15 sec 3.57 MBytes 29.5 Mbits/sec [ 4] 3.15-4.16 sec 4.33 MBytes 35.7 Mbits/sec [ 4] 4.16-6.21 sec 10.4 MBytes 42.7 Mbits/sec [ 4] 6.21-6.21 sec 0.00 Bytes 0.00 bits/sec [ 4] 6.21-7.35 sec 34.6 MBytes 253 Mbits/sec [ 4] 7.35-11.45 sec 22.0 MBytes 45.0 Mbits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-12.51 sec 16.0 MBytes 126 Mbits/sec [ 4] 12.51-13.59 sec 20.3 MBytes 158 Mbits/sec [ 4] 13.59-14.65 sec 13.4 MBytes 107 Mbits/sec [ 4] 14.65-16.79 sec 33.3 MBytes 130 Mbits/sec [ 4] 16.79-16.79 sec 0.00 Bytes 0.00 bits/sec [ 4] 16.79-17.82 sec 5.94 MBytes 48.7 Mbits/sec (etc) [root@Lab200slot2 ~]# iperf3 --sctp -6 -c 2001:db8:0:f101::1 -V -l 1400 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64 Time: Fri, 11 Apr 2014 19:08:41 GMT Connecting to host 2001:db8:0:f101::1, port 5201 Cookie: Lab200slot2.1397243321.714295.2b3f7c [ 4] local 2001:db8:0:f101::2 port 55804 connected to 2001:db8:0:f101::1 port 5201 Starting Test: protocol: SCTP, 1 streams, 1400 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.00 sec 169 MBytes 1.42 Gbits/sec [ 4] 1.00-2.00 sec 201 MBytes 1.69 Gbits/sec [ 4] 2.00-3.00 sec 188 MBytes 1.58 Gbits/sec [ 4] 3.00-4.00 sec 174 MBytes 1.46 Gbits/sec [ 4] 4.00-5.00 sec 165 MBytes 1.39 Gbits/sec [ 4] 5.00-6.00 sec 199 MBytes 1.67 Gbits/sec [ 4] 6.00-7.00 sec 163 MBytes 1.36 Gbits/sec [ 4] 7.00-8.00 sec 174 MBytes 1.46 Gbits/sec [ 4] 8.00-9.00 sec 193 MBytes 1.62 Gbits/sec [ 4] 9.00-10.00 sec 196 MBytes 1.65 Gbits/sec [ 4] 10.00-11.00 sec 157 MBytes 1.31 Gbits/sec [ 4] 11.00-12.00 sec 175 MBytes 1.47 Gbits/sec [ 4] 12.00-13.00 sec 192 MBytes 1.61 Gbits/sec [ 4] 13.00-14.00 sec 199 MBytes 1.67 Gbits/sec (etc) After patch: [root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.240.3 -V -l 1452 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0+ #1 SMP Mon Apr 14 12:06:40 EDT 2014 x86_64 Time: Mon, 14 Apr 2014 16:40:48 GMT Connecting to host 192.168.240.3, port 5201 Cookie: Lab200slot2.1397493648.413274.65e131 [ 4] local 192.168.240.2 port 50548 connected to 192.168.240.3 port 5201 Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.00 sec 240 MBytes 2.02 Gbits/sec [ 4] 1.00-2.00 sec 239 MBytes 2.01 Gbits/sec [ 4] 2.00-3.00 sec 240 MBytes 2.01 Gbits/sec [ 4] 3.00-4.00 sec 239 MBytes 2.00 Gbits/sec [ 4] 4.00-5.00 sec 245 MBytes 2.05 Gbits/sec [ 4] 5.00-6.00 sec 240 MBytes 2.01 Gbits/sec [ 4] 6.00-7.00 sec 240 MBytes 2.02 Gbits/sec [ 4] 7.00-8.00 sec 239 MBytes 2.01 Gbits/sec With the reverted patch applied, the SCTP/IPv4 performance is back to normal on latest upstream for IPv4 and IPv6 and has same throughput as 3.4.2 test kernel, steady and interval reports are smooth again. Fixes: ef2820a ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer") Reported-by: Peter Butler <[email protected]> Reported-by: Dongsheng Song <[email protected]> Reported-by: Fengguang Wu <[email protected]> Tested-by: Peter Butler <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Cc: Matija Glavinic Pecotic <[email protected]> Cc: Alexander Sverdlin <[email protected]> Cc: Vlad Yasevich <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
[ Upstream commit dc8eaaa ] When I open the LOCKDEP config and run these steps: modprobe 8021q vconfig add eth2 20 vconfig add eth2.20 30 ifconfig eth2 xx.xx.xx.xx then the Call Trace happened: [32524.386288] ============================================= [32524.386293] [ INFO: possible recursive locking detected ] [32524.386298] 3.14.0-rc2-0.7-default+ analogdevicesinc#35 Tainted: G O [32524.386302] --------------------------------------------- [32524.386306] ifconfig/3103 is trying to acquire lock: [32524.386310] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0 [32524.386326] [32524.386326] but task is already holding lock: [32524.386330] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40 [32524.386341] [32524.386341] other info that might help us debug this: [32524.386345] Possible unsafe locking scenario: [32524.386345] [32524.386350] CPU0 [32524.386352] ---- [32524.386354] lock(&vlan_netdev_addr_lock_key/1); [32524.386359] lock(&vlan_netdev_addr_lock_key/1); [32524.386364] [32524.386364] *** DEADLOCK *** [32524.386364] [32524.386368] May be due to missing lock nesting notation [32524.386368] [32524.386373] 2 locks held by ifconfig/3103: [32524.386376] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81431d42>] rtnl_lock+0x12/0x20 [32524.386387] #1: (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40 [32524.386398] [32524.386398] stack backtrace: [32524.386403] CPU: 1 PID: 3103 Comm: ifconfig Tainted: G O 3.14.0-rc2-0.7-default+ analogdevicesinc#35 [32524.386409] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [32524.386414] ffffffff81ffae40 ffff8800d9625ae8 ffffffff814f68a2 ffff8800d9625bc8 [32524.386421] ffffffff810a35fb ffff8800d8a8d9d0 00000000d9625b28 ffff8800d8a8e5d0 [32524.386428] 000003cc00000000 0000000000000002 ffff8800d8a8e5f8 0000000000000000 [32524.386435] Call Trace: [32524.386441] [<ffffffff814f68a2>] dump_stack+0x6a/0x78 [32524.386448] [<ffffffff810a35fb>] __lock_acquire+0x7ab/0x1940 [32524.386454] [<ffffffff810a323a>] ? __lock_acquire+0x3ea/0x1940 [32524.386459] [<ffffffff810a4874>] lock_acquire+0xe4/0x110 [32524.386464] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0 [32524.386471] [<ffffffff814fc07a>] _raw_spin_lock_nested+0x2a/0x40 [32524.386476] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0 [32524.386481] [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0 [32524.386489] [<ffffffffa0500cab>] vlan_dev_set_rx_mode+0x2b/0x50 [8021q] [32524.386495] [<ffffffff8141addf>] __dev_set_rx_mode+0x5f/0xb0 [32524.386500] [<ffffffff8141af8b>] dev_set_rx_mode+0x2b/0x40 [32524.386506] [<ffffffff8141b3cf>] __dev_open+0xef/0x150 [32524.386511] [<ffffffff8141b177>] __dev_change_flags+0xa7/0x190 [32524.386516] [<ffffffff8141b292>] dev_change_flags+0x32/0x80 [32524.386524] [<ffffffff8149ca56>] devinet_ioctl+0x7d6/0x830 [32524.386532] [<ffffffff81437b0b>] ? dev_ioctl+0x34b/0x660 [32524.386540] [<ffffffff814a05b0>] inet_ioctl+0x80/0xa0 [32524.386550] [<ffffffff8140199d>] sock_do_ioctl+0x2d/0x60 [32524.386558] [<ffffffff81401a52>] sock_ioctl+0x82/0x2a0 [32524.386568] [<ffffffff811a7123>] do_vfs_ioctl+0x93/0x590 [32524.386578] [<ffffffff811b2705>] ? rcu_read_lock_held+0x45/0x50 [32524.386586] [<ffffffff811b39e5>] ? __fget_light+0x105/0x110 [32524.386594] [<ffffffff811a76b1>] SyS_ioctl+0x91/0xb0 [32524.386604] [<ffffffff815057e2>] system_call_fastpath+0x16/0x1b ======================================================================== The reason is that all of the addr_lock_key for vlan dev have the same class, so if we change the status for vlan dev, the vlan dev and its real dev will hold the same class of addr_lock_key together, so the warning happened. we should distinguish the lock depth for vlan dev and its real dev. v1->v2: Convert the vlan_netdev_addr_lock_key to an array of eight elements, which could support to add 8 vlan id on a same vlan dev, I think it is enough for current scene, because a netdev's name is limited to IFNAMSIZ which could not hold 8 vlan id, and the vlan dev would not meet the same class key with its real dev. The new function vlan_dev_get_lockdep_subkey() will return the subkey and make the vlan dev could get a suitable class key. v2->v3: According David's suggestion, I use the subclass to distinguish the lock key for vlan dev and its real dev, but it make no sense, because the difference for subclass in the lock_class_key doesn't mean that the difference class for lock_key, so I use lock_depth to distinguish the different depth for every vlan dev, the same depth of the vlan dev could have the same lock_class_key, I import the MAX_LOCK_DEPTH from the include/linux/sched.h, I think it is enough here, the lockdep should never exceed that value. v3->v4: Add a huge array of locking keys will waste static kernel memory and is not a appropriate method, we could use _nested() variants to fix the problem, calculate the depth for every vlan dev, and use the depth as the subclass for addr_lock_key. Signed-off-by: Ding Tianhong <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
[ Upstream commit c674ac3 ] Macvlan devices try to avoid stacking, but that's not always successfull or even desired. As an example, the following configuration is perefectly legal and valid: eth0 <--- macvlan0 <---- vlan0.10 <--- macvlan1 However, this configuration produces the following lockdep trace: [ 115.620418] ====================================================== [ 115.620477] [ INFO: possible circular locking dependency detected ] [ 115.620516] 3.15.0-rc1+ analogdevicesinc#24 Not tainted [ 115.620540] ------------------------------------------------------- [ 115.620577] ip/1704 is trying to acquire lock: [ 115.620604] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80 [ 115.620686] but task is already holding lock: [ 115.620723] (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40 [ 115.620795] which lock already depends on the new lock. [ 115.620853] the existing dependency chain (in reverse order) is: [ 115.620894] -> #1 (&macvlan_netdev_addr_lock_key){+.....}: [ 115.620935] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130 [ 115.620974] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50 [ 115.621019] [<ffffffffa07296c3>] vlan_dev_set_rx_mode+0x53/0x110 [8021q] [ 115.621066] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0 [ 115.621105] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40 [ 115.621143] [<ffffffff815da6be>] __dev_open+0xde/0x140 [ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170 [ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60 [ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0 [ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730 [ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250 [ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0 [ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40 [ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0 [ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740 [ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0 [ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380 [ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80 [ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20 [ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b [ 115.621174] -> #0 (&vlan_netdev_addr_lock_key/1){+.....}: [ 115.621174] [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60 [ 115.621174] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130 [ 115.621174] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50 [ 115.621174] [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80 [ 115.621174] [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan] [ 115.621174] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0 [ 115.621174] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40 [ 115.621174] [<ffffffff815da6be>] __dev_open+0xde/0x140 [ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170 [ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60 [ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0 [ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730 [ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250 [ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0 [ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40 [ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0 [ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740 [ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0 [ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380 [ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80 [ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20 [ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b [ 115.621174] other info that might help us debug this: [ 115.621174] Possible unsafe locking scenario: [ 115.621174] CPU0 CPU1 [ 115.621174] ---- ---- [ 115.621174] lock(&macvlan_netdev_addr_lock_key); [ 115.621174] lock(&vlan_netdev_addr_lock_key/1); [ 115.621174] lock(&macvlan_netdev_addr_lock_key); [ 115.621174] lock(&vlan_netdev_addr_lock_key/1); [ 115.621174] *** DEADLOCK *** [ 115.621174] 2 locks held by ip/1704: [ 115.621174] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815e6dbb>] rtnetlink_rcv+0x1b/0x40 [ 115.621174] #1: (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40 [ 115.621174] stack backtrace: [ 115.621174] CPU: 3 PID: 1704 Comm: ip Not tainted 3.15.0-rc1+ analogdevicesinc#24 [ 115.621174] Hardware name: Hewlett-Packard HP xw8400 Workstation/0A08h, BIOS 786D5 v02.38 10/25/2010 [ 115.621174] ffffffff82339ae0 ffff880465f79568 ffffffff816ee20c ffffffff82339ae0 [ 115.621174] ffff880465f795a8 ffffffff816e9e1b ffff880465f79600 ffff880465b019c8 [ 115.621174] 0000000000000001 0000000000000002 ffff880465b019c8 ffff880465b01230 [ 115.621174] Call Trace: [ 115.621174] [<ffffffff816ee20c>] dump_stack+0x4d/0x66 [ 115.621174] [<ffffffff816e9e1b>] print_circular_bug+0x200/0x20e [ 115.621174] [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60 [ 115.621174] [<ffffffff810d3172>] ? trace_hardirqs_on_caller+0xb2/0x1d0 [ 115.621174] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130 [ 115.621174] [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80 [ 115.621174] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50 [ 115.621174] [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80 [ 115.621174] [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80 [ 115.621174] [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan] [ 115.621174] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0 [ 115.621174] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40 [ 115.621174] [<ffffffff815da6be>] __dev_open+0xde/0x140 [ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170 [ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60 [ 115.621174] [<ffffffff811e1db1>] ? mem_cgroup_bad_page_check+0x21/0x30 [ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0 [ 115.621174] [<ffffffff810d394c>] ? __lock_acquire+0x37c/0x1a60 [ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730 [ 115.621174] [<ffffffff815ea169>] ? rtnl_newlink+0xe9/0x730 [ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250 [ 115.621174] [<ffffffff810d329d>] ? trace_hardirqs_on+0xd/0x10 [ 115.621174] [<ffffffff815e6dbb>] ? rtnetlink_rcv+0x1b/0x40 [ 115.621174] [<ffffffff815e6de0>] ? rtnetlink_rcv+0x40/0x40 [ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0 [ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40 [ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0 [ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740 [ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0 [ 115.621174] [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0 [ 115.621174] [<ffffffff8119d4f8>] ? might_fault+0xa8/0xb0 [ 115.621174] [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0 [ 115.621174] [<ffffffff815cb51e>] ? verify_iovec+0x5e/0xe0 [ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380 [ 115.621174] [<ffffffff816faa0d>] ? __do_page_fault+0x11d/0x570 [ 115.621174] [<ffffffff810cfe9f>] ? up_read+0x1f/0x40 [ 115.621174] [<ffffffff816fab04>] ? __do_page_fault+0x214/0x570 [ 115.621174] [<ffffffff8120a10b>] ? mntput_no_expire+0x6b/0x1c0 [ 115.621174] [<ffffffff8120a0b7>] ? mntput_no_expire+0x17/0x1c0 [ 115.621174] [<ffffffff8120a284>] ? mntput+0x24/0x40 [ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80 [ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20 [ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b Fix this by correctly providing macvlan lockdep class. Signed-off-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
[ Upstream commit b14878c ] Currently, it is possible to create an SCTP socket, then switch auth_enable via sysctl setting to 1 and crash the system on connect: Oops[#1]: CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1 task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000 [...] Call Trace: [<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80 [<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4 [<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c [<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8 [<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214 [<ffffffff8043af68>] sctp_rcv+0x588/0x630 [<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24 [<ffffffff803acb50>] ip6_input+0x2c0/0x440 [<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564 [<ffffffff80310650>] process_backlog+0xb4/0x18c [<ffffffff80313cbc>] net_rx_action+0x12c/0x210 [<ffffffff80034254>] __do_softirq+0x17c/0x2ac [<ffffffff800345e0>] irq_exit+0x54/0xb0 [<ffffffff800075a4>] ret_from_irq+0x0/0x4 [<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48 [<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148 [<ffffffff805a88b0>] start_kernel+0x37c/0x398 Code: dd0900b8 000330f8 0126302d <dcc60000> 50c0fff1 0047182a a48306a0 03e00008 00000000 ---[ end trace b530b0551467f2fd ]--- Kernel panic - not syncing: Fatal exception in interrupt What happens while auth_enable=0 in that case is, that ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs() when endpoint is being created. After that point, if an admin switches over to auth_enable=1, the machine can crash due to NULL pointer dereference during reception of an INIT chunk. When we enter sctp_process_init() via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk, the INIT verification succeeds and while we walk and process all INIT params via sctp_process_param() we find that net->sctp.auth_enable is set, therefore do not fall through, but invoke sctp_auth_asoc_set_default_hmac() instead, and thus, dereference what we have set to NULL during endpoint initialization phase. The fix is to make auth_enable immutable by caching its value during endpoint initialization, so that its original value is being carried along until destruction. The bug seems to originate from the very first days. Fix in joint work with Daniel Borkmann. Reported-by: Joshua Kinard <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Neil Horman <[email protected]> Tested-by: Joshua Kinard <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
[ Upstream commit b394745 ] After the call to phy_init_hw failed in phy_attach_direct, phy_detach is called to detach the phy device from its network device. If the attached driver is a generic phy driver, this also detaches the driver. Subsequently phy_resume is called, which assumes without checking that a driver is attached to the device. This will result in a crash such as Unable to handle kernel paging request for data at address 0xffffffffffffff90 Faulting instruction address: 0xc0000000003a0e18 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c0000000003a0e18] .phy_attach_direct+0x68/0x17c LR [c0000000003a0e6c] .phy_attach_direct+0xbc/0x17c Call Trace: [c0000003fc0475d0] [c0000000003a0e6c] .phy_attach_direct+0xbc/0x17c (unreliable) [c0000003fc047670] [c0000000003a0ff8] .phy_connect_direct+0x28/0x98 [c0000003fc047700] [c0000000003f0074] .of_phy_connect+0x4c/0xa4 Only call phy_resume if phy_init_hw was successful. Signed-off-by: Guenter Roeck <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
[ Upstream commit bf63ac7 ] Kelly reported the following crash: IP: [<ffffffff817a993d>] tcf_action_exec+0x46/0x90 PGD 3009067 PUD 300c067 PMD 11ff30067 PTE 800000011634b060 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC CPU: 1 PID: 639 Comm: dhclient Not tainted 3.15.0-rc4+ analogdevicesinc#342 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8801169ecd00 ti: ffff8800d21b8000 task.ti: ffff8800d21b8000 RIP: 0010:[<ffffffff817a993d>] [<ffffffff817a993d>] tcf_action_exec+0x46/0x90 RSP: 0018:ffff8800d21b9b90 EFLAGS: 00010283 RAX: 00000000ffffffff RBX: ffff88011634b8e8 RCX: ffff8800cf7133d8 RDX: ffff88011634b900 RSI: ffff8800cf7133e0 RDI: ffff8800d210f840 RBP: ffff8800d21b9bb0 R08: ffffffff8287bf60 R09: 0000000000000001 R10: ffff8800d2b22b24 R11: 0000000000000001 R12: ffff8800d210f840 R13: ffff8800d21b9c50 R14: ffff8800cf7133e0 R15: ffff8800cad433d8 FS: 00007f49723e1840(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff88011634b8f0 CR3: 00000000ce469000 CR4: 00000000000006e0 Stack: ffff8800d2170188 ffff8800d210f840 ffff8800d2171b90 0000000000000000 ffff8800d21b9be8 ffffffff817c55bb ffff8800d21b9c50 ffff8800d2171b90 ffff8800d210f840 ffff8800d21b0300 ffff8800d21b9c50 ffff8800d21b9c18 Call Trace: [<ffffffff817c55bb>] tcindex_classify+0x88/0x9b [<ffffffff817a7f7d>] tc_classify_compat+0x3e/0x7b [<ffffffff817a7fdf>] tc_classify+0x25/0x9f [<ffffffff817b0e68>] htb_enqueue+0x55/0x27a [<ffffffff817b6c2e>] dsmark_enqueue+0x165/0x1a4 [<ffffffff81775642>] __dev_queue_xmit+0x35e/0x536 [<ffffffff8177582a>] dev_queue_xmit+0x10/0x12 [<ffffffff818f8ecd>] packet_sendmsg+0xb26/0xb9a [<ffffffff810b1507>] ? __lock_acquire+0x3ae/0xdf3 [<ffffffff8175cf08>] __sock_sendmsg_nosec+0x25/0x27 [<ffffffff8175d916>] sock_aio_write+0xd0/0xe7 [<ffffffff8117d6b8>] do_sync_write+0x59/0x78 [<ffffffff8117d84d>] vfs_write+0xb5/0x10a [<ffffffff8117d96a>] SyS_write+0x49/0x7f [<ffffffff8198e212>] system_call_fastpath+0x16/0x1b This is because we memcpy struct tcindex_filter_result which contains struct tcf_exts, obviously struct list_head can not be simply copied. This is a regression introduced by commit 33be627 (net_sched: act: use standard struct list_head). It's not very easy to fix it as the code is a mess: if (old_r) memcpy(&cr, r, sizeof(cr)); else { memset(&cr, 0, sizeof(cr)); tcf_exts_init(&cr.exts, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE); } ... tcf_exts_change(tp, &cr.exts, &e); ... memcpy(r, &cr, sizeof(cr)); the above code should equal to: tcindex_filter_result_init(&cr); if (old_r) cr.res = r->res; ... if (old_r) tcf_exts_change(tp, &r->exts, &e); else tcf_exts_change(tp, &cr.exts, &e); ... r->res = cr.res; after this change, since there is no need to copy struct tcf_exts. And it also fixes other places zero'ing struct's contains struct tcf_exts. Fixes: commit 33be627 (net_sched: act: use standard struct list_head) Reported-by: Kelly Anderson <[email protected]> Tested-by: Kelly Anderson <[email protected]> Cc: David S. Miller <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 4797ec2 upstream. The following happens when trying to run a kvm guest on a kernel configured for 64k pages. This doesn't happen with 4k pages: BUG: failure at include/linux/mm.h:297/put_page_testzero()! Kernel panic - not syncing: BUG! CPU: 2 PID: 4228 Comm: qemu-system-aar Tainted: GF 3.13.0-0.rc7.31.sa2.k32v1.aarch64.debug #1 Call trace: [<fffffe0000096034>] dump_backtrace+0x0/0x16c [<fffffe00000961b4>] show_stack+0x14/0x1c [<fffffe000066e648>] dump_stack+0x84/0xb0 [<fffffe0000668678>] panic+0xf4/0x220 [<fffffe000018ec78>] free_reserved_area+0x0/0x110 [<fffffe000018edd8>] free_pages+0x50/0x88 [<fffffe00000a759c>] kvm_free_stage2_pgd+0x30/0x40 [<fffffe00000a5354>] kvm_arch_destroy_vm+0x18/0x44 [<fffffe00000a1854>] kvm_put_kvm+0xf0/0x184 [<fffffe00000a1938>] kvm_vm_release+0x10/0x1c [<fffffe00001edc1c>] __fput+0xb0/0x288 [<fffffe00001ede4c>] ____fput+0xc/0x14 [<fffffe00000d5a2c>] task_work_run+0xa8/0x11c [<fffffe0000095c14>] do_notify_resume+0x54/0x58 In arch/arm/kvm/mmu.c:unmap_range(), we end up doing an extra put_page() on the stage2 pgd which leads to the BUG in put_page_testzero(). This happens because a pud_huge() test in unmap_range() returns true when it should always be false with 2-level pages tables used by 64k pages. This patch removes support for huge puds if 2-level pagetables are being used. Signed-off-by: Mark Salter <[email protected]> [[email protected]: removed #ifndef around PUD_SIZE check] Signed-off-by: Catalin Marinas <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit aa07c71 upstream. After setting ACL for directory, I got two problems that caused by the cached zero-length default posix acl. This patch make sure nfsd4_set_nfs4_acl calls ->set_acl with a NULL ACL structure if there are no entries. Thanks for Christoph Hellwig's advice. First problem: ............ hang ........... Second problem: [ 1610.167668] ------------[ cut here ]------------ [ 1610.168320] kernel BUG at /root/nfs/linux/fs/nfsd/nfs4acl.c:239! [ 1610.168320] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 1610.168320] Modules linked in: nfsv4(OE) nfs(OE) nfsd(OE) rpcsec_gss_krb5 fscache ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw auth_rpcgss nfs_acl snd_intel8x0 ppdev lockd snd_ac97_codec ac97_bus snd_pcm snd_timer e1000 pcspkr parport_pc snd parport serio_raw joydev i2c_piix4 sunrpc(OE) microcode soundcore i2c_core ata_generic pata_acpi [last unloaded: nfsd] [ 1610.168320] CPU: 0 PID: 27397 Comm: nfsd Tainted: G OE 3.15.0-rc1+ #15 [ 1610.168320] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 1610.168320] task: ffff88005ab653d0 ti: ffff88005a944000 task.ti: ffff88005a944000 [ 1610.168320] RIP: 0010:[<ffffffffa034d5ed>] [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP: 0018:ffff88005a945b00 EFLAGS: 00010293 [ 1610.168320] RAX: 0000000000000001 RBX: ffff88006700bac0 RCX: 0000000000000000 [ 1610.168320] RDX: 0000000000000000 RSI: ffff880067c83f00 RDI: ffff880068233300 [ 1610.168320] RBP: ffff88005a945b48 R08: ffffffff81c64830 R09: 0000000000000000 [ 1610.168320] R10: ffff88004ea85be0 R11: 000000000000f475 R12: ffff880068233300 [ 1610.168320] R13: 0000000000000003 R14: 0000000000000002 R15: ffff880068233300 [ 1610.168320] FS: 0000000000000000(0000) GS:ffff880077800000(0000) knlGS:0000000000000000 [ 1610.168320] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1610.168320] CR2: 00007f5bcbd3b0b9 CR3: 0000000001c0f000 CR4: 00000000000006f0 [ 1610.168320] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1610.168320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1610.168320] Stack: [ 1610.168320] ffffffff00000000 0000000b67c83500 000000076700bac0 0000000000000000 [ 1610.168320] ffff88006700bac0 ffff880068233300 ffff88005a945c08 0000000000000002 [ 1610.168320] 0000000000000000 ffff88005a945b88 ffffffffa034e2d5 000000065a945b68 [ 1610.168320] Call Trace: [ 1610.168320] [<ffffffffa034e2d5>] nfsd4_get_nfs4_acl+0x95/0x150 [nfsd] [ 1610.168320] [<ffffffffa03400d6>] nfsd4_encode_fattr+0x646/0x1e70 [nfsd] [ 1610.168320] [<ffffffff816a6e6e>] ? kmemleak_alloc+0x4e/0xb0 [ 1610.168320] [<ffffffffa0327962>] ? nfsd_setuser_and_check_port+0x52/0x80 [nfsd] [ 1610.168320] [<ffffffff812cd4bb>] ? selinux_cred_prepare+0x1b/0x30 [ 1610.168320] [<ffffffffa0341caa>] nfsd4_encode_getattr+0x5a/0x60 [nfsd] [ 1610.168320] [<ffffffffa0341e07>] nfsd4_encode_operation+0x67/0x110 [nfsd] [ 1610.168320] [<ffffffffa033844d>] nfsd4_proc_compound+0x21d/0x810 [nfsd] [ 1610.168320] [<ffffffffa0324d9b>] nfsd_dispatch+0xbb/0x200 [nfsd] [ 1610.168320] [<ffffffffa00850cd>] svc_process_common+0x46d/0x6d0 [sunrpc] [ 1610.168320] [<ffffffffa0085433>] svc_process+0x103/0x170 [sunrpc] [ 1610.168320] [<ffffffffa032472f>] nfsd+0xbf/0x130 [nfsd] [ 1610.168320] [<ffffffffa0324670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 1610.168320] [<ffffffff810a5202>] kthread+0xd2/0xf0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] [<ffffffff816c1ebc>] ret_from_fork+0x7c/0xb0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] Code: 78 02 e9 e7 fc ff ff 31 c0 31 d2 31 c9 66 89 45 ce 41 8b 04 24 66 89 55 d0 66 89 4d d2 48 8d 04 80 49 8d 5c 84 04 e9 37 fd ff ff <0f> 0b 90 0f 1f 44 00 00 55 8b 56 08 c7 07 00 00 00 00 8b 46 0c [ 1610.168320] RIP [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP <ffff88005a945b00> [ 1610.257313] ---[ end trace 838254e3e352285b ]--- Signed-off-by: Kinglong Mee <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 1c4abec upstream. There was a deadlock in monitor mode when we were setting the channel if the channel was not 1. ====================================================== [ INFO: possible circular locking dependency detected ] 3.14.3 #4 Not tainted ------------------------------------------------------- iw/3323 is trying to acquire lock: (&local->chanctx_mtx){+.+.+.}, at: [<ffffffffa062e2f2>] ieee80211_vif_release_channel+0x42/0xb0 [mac80211] but task is already holding lock: (&local->iflist_mtx){+.+...}, at: [<ffffffffa0609e0a>] ieee80211_set_monitor_channel+0x5a/0x1b0 [mac80211] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&local->iflist_mtx){+.+...}: [<ffffffff810d95bb>] __lock_acquire+0xb3b/0x13b0 [<ffffffff810d9ee0>] lock_acquire+0xb0/0x1f0 [<ffffffff817eb9c8>] mutex_lock_nested+0x78/0x4f0 [<ffffffffa06225cf>] ieee80211_iterate_active_interfaces+0x2f/0x60 [mac80211] [<ffffffffa0518189>] iwl_mvm_recalc_multicast+0x49/0xa0 [iwlmvm] [<ffffffffa051822e>] iwl_mvm_configure_filter+0x4e/0x70 [iwlmvm] [<ffffffffa05e6d43>] ieee80211_configure_filter+0x153/0x5f0 [mac80211] [<ffffffffa05e71f5>] ieee80211_reconfig_filter+0x15/0x20 [mac80211] [snip] -> #1 (&mvm->mutex){+.+.+.}: [<ffffffff810d95bb>] __lock_acquire+0xb3b/0x13b0 [<ffffffff810d9ee0>] lock_acquire+0xb0/0x1f0 [<ffffffff817eb9c8>] mutex_lock_nested+0x78/0x4f0 [<ffffffffa0517246>] iwl_mvm_add_chanctx+0x56/0xe0 [iwlmvm] [<ffffffffa062ca1e>] ieee80211_new_chanctx+0x13e/0x410 [mac80211] [<ffffffffa062d953>] ieee80211_vif_use_channel+0x1c3/0x5a0 [mac80211] [<ffffffffa06035ab>] ieee80211_add_virtual_monitor+0x1ab/0x6b0 [mac80211] [<ffffffffa06052ea>] ieee80211_do_open+0xe6a/0x15a0 [mac80211] [<ffffffffa0605a79>] ieee80211_open+0x59/0x60 [mac80211] [snip] -> #0 (&local->chanctx_mtx){+.+.+.}: [<ffffffff810d6cb7>] check_prevs_add+0x977/0x980 [<ffffffff810d95bb>] __lock_acquire+0xb3b/0x13b0 [<ffffffff810d9ee0>] lock_acquire+0xb0/0x1f0 [<ffffffff817eb9c8>] mutex_lock_nested+0x78/0x4f0 [<ffffffffa062e2f2>] ieee80211_vif_release_channel+0x42/0xb0 [mac80211] [<ffffffffa0609ec3>] ieee80211_set_monitor_channel+0x113/0x1b0 [mac80211] [<ffffffffa058fb37>] cfg80211_set_monitor_channel+0x77/0x2b0 [cfg80211] [<ffffffffa056e0b2>] __nl80211_set_channel+0x122/0x140 [cfg80211] [<ffffffffa0581374>] nl80211_set_wiphy+0x284/0xaf0 [cfg80211] [snip] other info that might help us debug this: Chain exists of: &local->chanctx_mtx --> &mvm->mutex --> &local->iflist_mtx Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&local->iflist_mtx); lock(&mvm->mutex); lock(&local->iflist_mtx); lock(&local->chanctx_mtx); *** DEADLOCK *** This deadlock actually occurs: INFO: task iw:3323 blocked for more than 120 seconds. Not tainted 3.14.3 #4 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. iw D ffff8800c8afcd80 4192 3323 3322 0x00000000 ffff880078fdb7e0 0000000000000046 ffff8800c8afcd80 ffff880078fdbfd8 00000000001d5540 00000000001d5540 ffff8801141b0000 ffff8800c8afcd80 ffff880078ff9e38 ffff880078ff9e38 ffff880078ff9e40 0000000000000246 Call Trace: [<ffffffff817ea841>] schedule_preempt_disabled+0x31/0x80 [<ffffffff817ebaed>] mutex_lock_nested+0x19d/0x4f0 [<ffffffffa06225cf>] ? ieee80211_iterate_active_interfaces+0x2f/0x60 [mac80211] [<ffffffffa06225cf>] ? ieee80211_iterate_active_interfaces+0x2f/0x60 [mac80211] [<ffffffffa052a680>] ? iwl_mvm_power_mac_update_mode+0xc0/0xc0 [iwlmvm] [<ffffffffa06225cf>] ieee80211_iterate_active_interfaces+0x2f/0x60 [mac80211] [<ffffffffa0529357>] _iwl_mvm_power_update_binding+0x27/0x80 [iwlmvm] [<ffffffffa0516eb1>] iwl_mvm_unassign_vif_chanctx+0x81/0xc0 [iwlmvm] [<ffffffffa062d3ff>] __ieee80211_vif_release_channel+0xdf/0x470 [mac80211] [<ffffffffa062e2fa>] ieee80211_vif_release_channel+0x4a/0xb0 [mac80211] [<ffffffffa0609ec3>] ieee80211_set_monitor_channel+0x113/0x1b0 [mac80211] [<ffffffffa058fb37>] cfg80211_set_monitor_channel+0x77/0x2b0 [cfg80211] [<ffffffffa056e0b2>] __nl80211_set_channel+0x122/0x140 [cfg80211] [<ffffffffa0581374>] nl80211_set_wiphy+0x284/0xaf0 [cfg80211] This fixes https://bugzilla.kernel.org/show_bug.cgi?id=75541 Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit c5450db upstream. While accessing cur_policy during executing events CPUFREQ_GOV_START, CPUFREQ_GOV_STOP, CPUFREQ_GOV_LIMITS, same mutex lock is not taken, dbs_data->mutex, which leads to race and data corruption while running continious suspend resume test. This is seen with ondemand governor with suspend resume test using rtcwake. Unable to handle kernel NULL pointer dereference at virtual address 00000028 pgd = ed610000 [00000028] *pgd=adf11831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM Modules linked in: nvhost_vi CPU: 1 PID: 3243 Comm: rtcwake Not tainted 3.10.24-gf5cf9e5 #1 task: ee708040 ti: ed61c000 task.ti: ed61c000 PC is at cpufreq_governor_dbs+0x400/0x634 LR is at cpufreq_governor_dbs+0x3f8/0x634 pc : [<c05652b8>] lr : [<c05652b0>] psr: 600f0013 sp : ed61dcb0 ip : 000493e0 fp : c1cc14f0 r10: 00000000 r9 : 00000000 r8 : 00000000 r7 : eb725280 r6 : c1cc1560 r5 : eb575200 r4 : ebad7740 r3 : ee708040 r2 : ed61dca8 r1 : 001ebd24 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: ad61006a DAC: 00000015 [<c05652b8>] (cpufreq_governor_dbs+0x400/0x634) from [<c055f700>] (__cpufreq_governor+0x98/0x1b4) [<c055f700>] (__cpufreq_governor+0x98/0x1b4) from [<c0560770>] (__cpufreq_set_policy+0x250/0x320) [<c0560770>] (__cpufreq_set_policy+0x250/0x320) from [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) from [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c) [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) from [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) from [<c004b4b8>] (tegra_pm_notify+0x114/0x134) [<c004b4b8>] (tegra_pm_notify+0x114/0x134) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c) [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) from [<c00ad38c>] (enter_state+0xec/0x128) [<c00ad38c>] (enter_state+0xec/0x128) from [<c00ad400>] (pm_suspend+0x38/0xa4) [<c00ad400>] (pm_suspend+0x38/0xa4) from [<c00ac114>] (state_store+0x70/0xc0) [<c00ac114>] (state_store+0x70/0xc0) from [<c027b1e8>] (kobj_attr_store+0x14/0x20) [<c027b1e8>] (kobj_attr_store+0x14/0x20) from [<c019cd9c>] (sysfs_write_file+0x104/0x184) [<c019cd9c>] (sysfs_write_file+0x104/0x184) from [<c0143038>] (vfs_write+0xd0/0x19c) [<c0143038>] (vfs_write+0xd0/0x19c) from [<c0143414>] (SyS_write+0x4c/0x78) [<c0143414>] (SyS_write+0x4c/0x78) from [<c000f080>] (ret_fast_syscall+0x0/0x30) Code: e1a00006 eb084346 e59b0020 e5951024 (e5903028) ---[ end trace 0488523c8f6b0f9d ]--- Signed-off-by: Bibek Basu <[email protected]> Acked-by: Viresh Kumar <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit ecd15dd upstream. This bug manifests when calling the nft command line tool without nf_tables kernel support. kernel message: [ 44.071555] Netfilter messages via NETLINK v0.30. [ 44.072253] BUG: unable to handle kernel NULL pointer dereference at 0000000000000119 [ 44.072264] IP: [<ffffffff8171db1f>] netlink_getsockbyportid+0xf/0x70 [ 44.072272] PGD 7f2b74067 PUD 7f2b73067 PMD 0 [ 44.072277] Oops: 0000 [#1] SMP [...] [ 44.072369] Call Trace: [ 44.072373] [<ffffffff8171fd81>] netlink_unicast+0x91/0x200 [ 44.072377] [<ffffffff817206c9>] netlink_ack+0x99/0x110 [ 44.072381] [<ffffffffa004b951>] nfnetlink_rcv+0x3c1/0x408 [nfnetlink] [ 44.072385] [<ffffffff8171fde3>] netlink_unicast+0xf3/0x200 [ 44.072389] [<ffffffff817201ef>] netlink_sendmsg+0x2ff/0x740 [ 44.072394] [<ffffffff81044752>] ? __mmdrop+0x62/0x90 [ 44.072398] [<ffffffff816dafdb>] sock_sendmsg+0x8b/0xc0 [ 44.072403] [<ffffffff812f1af5>] ? copy_user_enhanced_fast_string+0x5/0x10 [ 44.072406] [<ffffffff816dbb6c>] ? move_addr_to_kernel+0x2c/0x50 [ 44.072410] [<ffffffff816db423>] ___sys_sendmsg+0x3c3/0x3d0 [ 44.072415] [<ffffffff811301ba>] ? handle_mm_fault+0xa9a/0xc60 [ 44.072420] [<ffffffff811362d6>] ? mmap_region+0x166/0x5a0 [ 44.072424] [<ffffffff817da84c>] ? __do_page_fault+0x1dc/0x510 [ 44.072428] [<ffffffff812b8b2c>] ? apparmor_capable+0x1c/0x60 [ 44.072435] [<ffffffff817d6e9a>] ? _raw_spin_unlock_bh+0x1a/0x20 [ 44.072439] [<ffffffff816dfc86>] ? release_sock+0x106/0x150 [ 44.072443] [<ffffffff816dc212>] __sys_sendmsg+0x42/0x80 [ 44.072446] [<ffffffff816dc262>] SyS_sendmsg+0x12/0x20 [ 44.072450] [<ffffffff817df616>] system_call_fastpath+0x1a/0x1f Signed-off-by: Denys Fedoryshchenko <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 0430e49 upstream. Commit 8aac627 "move exit_task_namespaces() outside of exit_notify" introduced the kernel opps since the kernel v3.10, which happens when Apparmor and IMA-appraisal are enabled at the same time. ---------------------------------------------------------------------- [ 106.750167] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 106.750221] IP: [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.750241] PGD 0 [ 106.750254] Oops: 0000 [#1] SMP [ 106.750272] Modules linked in: cuse parport_pc ppdev bnep rfcomm bluetooth rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd snd_hda_codec_realtek dcdbas snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi psmouse snd_seq microcode serio_raw snd_timer snd_seq_device snd soundcore video lpc_ich coretemp mac_hid lp parport mei_me mei nbd hid_generic e1000e usbhid ahci ptp hid libahci pps_core [ 106.750658] CPU: 6 PID: 1394 Comm: mysqld Not tainted 3.13.0-rc7-kds+ #15 [ 106.750673] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A08 09/19/2012 [ 106.750689] task: ffff8800de804920 ti: ffff880400fca000 task.ti: ffff880400fca000 [ 106.750704] RIP: 0010:[<ffffffff811ec7da>] [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.750725] RSP: 0018:ffff880400fcba60 EFLAGS: 00010286 [ 106.750738] RAX: 0000000000000000 RBX: 0000000000000100 RCX: ffff8800d51523e7 [ 106.750764] RDX: ffffffffffffffea RSI: ffff880400fcba34 RDI: ffff880402d20020 [ 106.750791] RBP: ffff880400fcbae0 R08: 0000000000000000 R09: 0000000000000001 [ 106.750817] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800d5152300 [ 106.750844] R13: ffff8803eb8df510 R14: ffff880400fcbb28 R15: ffff8800d51523e7 [ 106.750871] FS: 0000000000000000(0000) GS:ffff88040d200000(0000) knlGS:0000000000000000 [ 106.750910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 106.750935] CR2: 0000000000000018 CR3: 0000000001c0e000 CR4: 00000000001407e0 [ 106.750962] Stack: [ 106.750981] ffffffff813434eb ffff880400fcbb20 ffff880400fcbb18 0000000000000000 [ 106.751037] ffff8800de804920 ffffffff8101b9b9 0001800000000000 0000000000000100 [ 106.751093] 0000010000000000 0000000000000002 000000000000000e ffff8803eb8df500 [ 106.751149] Call Trace: [ 106.751172] [<ffffffff813434eb>] ? aa_path_name+0x2ab/0x430 [ 106.751199] [<ffffffff8101b9b9>] ? sched_clock+0x9/0x10 [ 106.751225] [<ffffffff8134a68d>] aa_path_perm+0x7d/0x170 [ 106.751250] [<ffffffff8101b945>] ? native_sched_clock+0x15/0x80 [ 106.751276] [<ffffffff8134aa73>] aa_file_perm+0x33/0x40 [ 106.751301] [<ffffffff81348c5e>] common_file_perm+0x8e/0xb0 [ 106.751327] [<ffffffff81348d78>] apparmor_file_permission+0x18/0x20 [ 106.751355] [<ffffffff8130c853>] security_file_permission+0x23/0xa0 [ 106.751382] [<ffffffff811c77a2>] rw_verify_area+0x52/0xe0 [ 106.751407] [<ffffffff811c789d>] vfs_read+0x6d/0x170 [ 106.751432] [<ffffffff811cda31>] kernel_read+0x41/0x60 [ 106.751457] [<ffffffff8134fd45>] ima_calc_file_hash+0x225/0x280 [ 106.751483] [<ffffffff8134fb52>] ? ima_calc_file_hash+0x32/0x280 [ 106.751509] [<ffffffff8135022d>] ima_collect_measurement+0x9d/0x160 [ 106.751536] [<ffffffff810b552d>] ? trace_hardirqs_on+0xd/0x10 [ 106.751562] [<ffffffff8134f07c>] ? ima_file_free+0x6c/0xd0 [ 106.751587] [<ffffffff81352824>] ima_update_xattr+0x34/0x60 [ 106.751612] [<ffffffff8134f0d0>] ima_file_free+0xc0/0xd0 [ 106.751637] [<ffffffff811c9635>] __fput+0xd5/0x300 [ 106.751662] [<ffffffff811c98ae>] ____fput+0xe/0x10 [ 106.751687] [<ffffffff81086774>] task_work_run+0xc4/0xe0 [ 106.751712] [<ffffffff81066fad>] do_exit+0x2bd/0xa90 [ 106.751738] [<ffffffff8173c958>] ? retint_swapgs+0x13/0x1b [ 106.751763] [<ffffffff8106780c>] do_group_exit+0x4c/0xc0 [ 106.751788] [<ffffffff81067894>] SyS_exit_group+0x14/0x20 [ 106.751814] [<ffffffff8174522d>] system_call_fastpath+0x1a/0x1f [ 106.751839] Code: c3 0f 1f 44 00 00 55 48 89 e5 e8 22 fe ff ff 5d c3 0f 1f 44 00 00 55 65 48 8b 04 25 c0 c9 00 00 48 8b 80 28 06 00 00 48 89 e5 5d <48> 8b 40 18 48 39 87 c0 00 00 00 0f 94 c0 c3 0f 1f 80 00 00 00 [ 106.752185] RIP [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.752214] RSP <ffff880400fcba60> [ 106.752236] CR2: 0000000000000018 [ 106.752258] ---[ end trace 3c520748b4732721 ]--- ---------------------------------------------------------------------- The reason for the oops is that IMA-appraisal uses "kernel_read()" when file is closed. kernel_read() honors LSM security hook which calls Apparmor handler, which uses current->nsproxy->mnt_ns. The 'guilty' commit changed the order of cleanup code so that nsproxy->mnt_ns was not already available for Apparmor. Discussion about the issue with Al Viro and Eric W. Biederman suggested that kernel_read() is too high-level for IMA. Another issue, except security checking, that was identified is mandatory locking. kernel_read honors it as well and it might prevent IMA from calculating necessary hash. It was suggested to use simplified version of the function without security and locking checks. This patch introduces special version ima_kernel_read(), which skips security and mandatory locking checking. It prevents the kernel oops to happen. Signed-off-by: Dmitry Kasatkin <[email protected]> Suggested-by: Eric W. Biederman <[email protected]> Signed-off-by: Mimi Zohar <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
[ Upstream commit 9709674 ] Alexey gave a AddressSanitizer[1] report that finally gave a good hint at where was the origin of various problems already reported by Dormando in the past [2] Problem comes from the fact that UDP can have a lockless TX path, and concurrent threads can manipulate sk_dst_cache, while another thread, is holding socket lock and calls __sk_dst_set() in ip4_datagram_release_cb() (this was added in linux-3.8) It seems that all we need to do is to use sk_dst_check() and sk_dst_set() so that all the writers hold same spinlock (sk->sk_dst_lock) to prevent corruptions. TCP stack do not need this protection, as all sk_dst_cache writers hold the socket lock. [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel AddressSanitizer: heap-use-after-free in ipv4_dst_check Read of size 2 by thread T15453: [<ffffffff817daa3a>] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116 [<ffffffff8175b789>] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531 [<ffffffff81830a36>] ip4_datagram_release_cb+0x46/0x390 ??:0 [<ffffffff8175eaea>] release_sock+0x17a/0x230 ./net/core/sock.c:2413 [<ffffffff81830882>] ip4_datagram_connect+0x462/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b ./arch/x86/kernel/entry_64.S:629 Freed by thread T15455: [<ffffffff8178d9b8>] dst_destroy+0xa8/0x160 ./net/core/dst.c:251 [<ffffffff8178de25>] dst_release+0x45/0x80 ./net/core/dst.c:280 [<ffffffff818304c1>] ip4_datagram_connect+0xa1/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b ./arch/x86/kernel/entry_64.S:629 Allocated by thread T15453: [<ffffffff8178d291>] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171 [<ffffffff817db3b7>] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406 [< inlined >] __ip_route_output_key+0x3e8/0xf70 __mkroute_output ./net/ipv4/route.c:1939 [<ffffffff817dde08>] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161 [<ffffffff817deb34>] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249 [<ffffffff81830737>] ip4_datagram_connect+0x317/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b ./arch/x86/kernel/entry_64.S:629 [2] <4>[196727.311203] general protection fault: 0000 [#1] SMP <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1 <4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013 <4>[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000 <4>[196727.311377] RIP: 0010:[<ffffffff815f8c7f>] [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80 <4>[196727.311399] RSP: 0018:ffff885effd23a70 EFLAGS: 00010282 <4>[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040 <4>[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200 <4>[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800 <4>[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 <4>[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce <4>[196727.311510] FS: 0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000 <4>[196727.311554] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0 <4>[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>[196727.311713] Stack: <4>[196727.311733] ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42 <4>[196727.311784] ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0 <4>[196727.311834] ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0 <4>[196727.311885] Call Trace: <4>[196727.311907] <IRQ> <4>[196727.311912] [<ffffffff815b7f42>] dst_destroy+0x32/0xe0 <4>[196727.311959] [<ffffffff815b86c6>] dst_release+0x56/0x80 <4>[196727.311986] [<ffffffff81620bd5>] tcp_v4_do_rcv+0x2a5/0x4a0 <4>[196727.312013] [<ffffffff81622b5a>] tcp_v4_rcv+0x7da/0x820 <4>[196727.312041] [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360 <4>[196727.312070] [<ffffffff815de02d>] ? nf_hook_slow+0x7d/0x150 <4>[196727.312097] [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360 <4>[196727.312125] [<ffffffff815fda92>] ip_local_deliver_finish+0xb2/0x230 <4>[196727.312154] [<ffffffff815fdd9a>] ip_local_deliver+0x4a/0x90 <4>[196727.312183] [<ffffffff815fd799>] ip_rcv_finish+0x119/0x360 <4>[196727.312212] [<ffffffff815fe00b>] ip_rcv+0x22b/0x340 <4>[196727.312242] [<ffffffffa0339680>] ? macvlan_broadcast+0x160/0x160 [macvlan] <4>[196727.312275] [<ffffffff815b0c62>] __netif_receive_skb_core+0x512/0x640 <4>[196727.312308] [<ffffffff811427fb>] ? kmem_cache_alloc+0x13b/0x150 <4>[196727.312338] [<ffffffff815b0db1>] __netif_receive_skb+0x21/0x70 <4>[196727.312368] [<ffffffff815b0fa1>] netif_receive_skb+0x31/0xa0 <4>[196727.312397] [<ffffffff815b1ae8>] napi_gro_receive+0xe8/0x140 <4>[196727.312433] [<ffffffffa00274f1>] ixgbe_poll+0x551/0x11f0 [ixgbe] <4>[196727.312463] [<ffffffff815fe00b>] ? ip_rcv+0x22b/0x340 <4>[196727.312491] [<ffffffff815b1691>] net_rx_action+0x111/0x210 <4>[196727.312521] [<ffffffff815b0db1>] ? __netif_receive_skb+0x21/0x70 <4>[196727.312552] [<ffffffff810519d0>] __do_softirq+0xd0/0x270 <4>[196727.312583] [<ffffffff816cef3c>] call_softirq+0x1c/0x30 <4>[196727.312613] [<ffffffff81004205>] do_softirq+0x55/0x90 <4>[196727.312640] [<ffffffff81051c85>] irq_exit+0x55/0x60 <4>[196727.312668] [<ffffffff816cf5c3>] do_IRQ+0x63/0xe0 <4>[196727.312696] [<ffffffff816c5aaa>] common_interrupt+0x6a/0x6a <4>[196727.312722] <EOI> <1>[196727.313071] RIP [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80 <4>[196727.313100] RSP <ffff885effd23a70> <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]--- <0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt Reported-by: Alexey Preobrazhensky <[email protected]> Reported-by: dormando <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Fixes: 8141ed9 ("ipv4: Add a socket release callback for datagram sockets") Cc: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit d5653f2 upstream. Fix NULL pointer exceptions when platform data is not supplied. Trace of one exception: Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = c0004000 [00000008] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM Modules linked in: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-12045-gead5dd4687a6-dirty analogdevicesinc#1628 task: eea80000 ti: eea88000 task.ti: eea88000 PC is at max77693_muic_probe+0x27c/0x528 LR is at regmap_write+0x50/0x60 pc : [<c041d1c8>] lr : [<c02eba60>] psr: 20000113 sp : eea89e38 ip : 00000000 fp : c098a834 r10: ee1a5a10 r9 : 00000005 r8 : c098a83c r7 : 0000000a r6 : c098a774 r5 : 00000005 r4 : eeb006d0 r3 : c0697bd8 r2 : 00000000 r1 : 00000001 r0 : 00000000 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 4000404a DAC: 00000015 Process swapper/0 (pid: 1, stack limit = 0xeea88240) Stack: (0xeea89e38 to 0xeea8a000) 9e20: c08499fc eeb006d0 9e40: 00000000 00000000 c0915f98 00000001 00000000 ee1a5a10 c098a730 c09a88b8 9e60: 00000000 c098a730 c0915f98 00000000 00000000 c02d6aa0 c02d6a88 ee1a5a10 9e80: c0a712c8 c02d54e4 00001204 c0628b00 ee1a5a10 c098a730 ee1a5a44 00000000 9ea0: eea88000 c02d57b4 00000000 c098a730 c02d5728 c02d3a24 ee813e5c eeb9d534 9ec0: c098a730 ee22f700 c097c720 c02d4b14 c08174ec c098a730 00000006 c098a730 9ee0: 00000006 c092fd30 c09b8500 c02d5df8 00000000 c093cbb8 00000006 c0008928 9f00: 000000c3 ef7fc785 00000000 ef7fc794 00000000 c08af968 00000072 eea89f30 9f20: ef7fc85e c065f198 000000c3 c003e87c 00000003 00000000 c092fd3c 00000000 9f40: c08af618 c0826d58 00000006 00000006 c0956f58 c093cbb8 00000006 c092fd30 9f60: c09b8500 000000c3 c092fd3c c08e8510 00000000 c08e8bb0 00000006 00000006 9f80: c08e8510 c0c0c0c0 00000000 c0628fac 00000000 00000000 00000000 00000000 9fa0: 00000000 c0628fb4 00000000 c000f038 00000000 00000000 00000000 00000000 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 c0c0c0c0 c0c0c0c0 [<c041d1c8>] (max77693_muic_probe) from [<c02d6aa0>] (platform_drv_probe+0x18/0x48) [<c02d6aa0>] (platform_drv_probe) from [<c02d54e4>] (driver_probe_device+0x140/0x384) [<c02d54e4>] (driver_probe_device) from [<c02d57b4>] (__driver_attach+0x8c/0x90) [<c02d57b4>] (__driver_attach) from [<c02d3a24>] (bus_for_each_dev+0x54/0x88) [<c02d3a24>] (bus_for_each_dev) from [<c02d4b14>] (bus_add_driver+0xe8/0x204) [<c02d4b14>] (bus_add_driver) from [<c02d5df8>] (driver_register+0x78/0xf4) [<c02d5df8>] (driver_register) from [<c0008928>] (do_one_initcall+0xc4/0x174) [<c0008928>] (do_one_initcall) from [<c08e8bb0>] (kernel_init_freeable+0xfc/0x1c8) [<c08e8bb0>] (kernel_init_freeable) from [<c0628fb4>] (kernel_init+0x8/0xec) [<c0628fb4>] (kernel_init) from [<c000f038>] (ret_from_fork+0x14/0x3c) Code: caffffe7 e59d200c e3550001 b3a05001 (e5923008) ---[ end trace 85db969ce011bde7 ]--- Signed-off-by: Krzysztof Kozlowski <[email protected]> Fixes: 190d7cf Signed-off-by: Chanwoo Choi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit e47043a upstream. The OpenBlocks AX3-4 has a non-DT bootloader. It also comes with 1GB of soldered on RAM, and a DIMM slot for expansion. Unfortunately, atags_to_fdt() doesn't work in big-endian mode, so we see the following failure when attempting to boot a big-endian kernel: 686 slab pages 17 pages shared 0 pages swap cached [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name Kernel panic - not syncing: Out of memory and no killable processes... CPU: 1 PID: 351 Comm: kworker/u4:0 Not tainted 3.15.0-rc8-next-20140603 #1 [<c0215a54>] (unwind_backtrace) from [<c021160c>] (show_stack+0x10/0x14) [<c021160c>] (show_stack) from [<c0802500>] (dump_stack+0x78/0x94) [<c0802500>] (dump_stack) from [<c0800068>] (panic+0x90/0x21c) [<c0800068>] (panic) from [<c02b5704>] (out_of_memory+0x320/0x340) [<c02b5704>] (out_of_memory) from [<c02b93a0>] (__alloc_pages_nodemask+0x874/0x930) [<c02b93a0>] (__alloc_pages_nodemask) from [<c02d446c>] (handle_mm_fault+0x744/0x96c) [<c02d446c>] (handle_mm_fault) from [<c02cf250>] (__get_user_pages+0xd0/0x4c0) [<c02cf250>] (__get_user_pages) from [<c02f3598>] (get_arg_page+0x54/0xbc) [<c02f3598>] (get_arg_page) from [<c02f3878>] (copy_strings+0x278/0x29c) [<c02f3878>] (copy_strings) from [<c02f38bc>] (copy_strings_kernel+0x20/0x28) [<c02f38bc>] (copy_strings_kernel) from [<c02f4f1c>] (do_execve+0x3a8/0x4c8) [<c02f4f1c>] (do_execve) from [<c025ac10>] (____call_usermodehelper+0x15c/0x194) [<c025ac10>] (____call_usermodehelper) from [<c020e9b8>] (ret_from_fork+0x14/0x3c) CPU0: stopping CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.15.0-rc8-next-20140603 #1 [<c0215a54>] (unwind_backtrace) from [<c021160c>] (show_stack+0x10/0x14) [<c021160c>] (show_stack) from [<c0802500>] (dump_stack+0x78/0x94) [<c0802500>] (dump_stack) from [<c021429c>] (handle_IPI+0x138/0x174) [<c021429c>] (handle_IPI) from [<c02087f0>] (armada_370_xp_handle_irq+0xb0/0xcc) [<c02087f0>] (armada_370_xp_handle_irq) from [<c0212100>] (__irq_svc+0x40/0x50) Exception stack(0xc0b6bf68 to 0xc0b6bfb0) bf60: e9fad598 00000000 00f509a3 00000000 c0b6a000 c0b724c4 bf80: c0b72458 c0b6a000 00000000 00000000 c0b66da0 c0b6a000 00000000 c0b6bfb0 bfa0: c027bb94 c027bb24 60000313 ffffffff [<c0212100>] (__irq_svc) from [<c027bb24>] (cpu_startup_entry+0x54/0x214) [<c027bb24>] (cpu_startup_entry) from [<c0ac5b30>] (start_kernel+0x318/0x37c) [<c0ac5b30>] (start_kernel) from [<00208078>] (0x208078) ---[ end Kernel panic - not syncing: Out of memory and no killable processes... A similar failure will also occur if ARM_ATAG_DTB_COMPAT isn't selected. Fix this by setting a sane default (1 GB) in the dts file. Signed-off-by: Jason Cooper <[email protected]> Tested-by: Kevin Hilman <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit ed55b6a upstream. When encountering memory pressure, testers have run into the following lockdep warning. It was caused by __link_block_group calling kobject_add with the groups_sem held. kobject_add calls kvasprintf with GFP_KERNEL, which gets us into reclaim context. The kobject doesn't actually need to be added under the lock -- it just needs to ensure that it's only added for the first block group to be linked. ========================================================= [ INFO: possible irq lock inversion dependency detected ] 3.14.0-rc8-default #1 Not tainted --------------------------------------------------------- kswapd0/169 just changed the state of lock: (&delayed_node->mutex){+.+.-.}, at: [<ffffffffa018baea>] __btrfs_release_delayed_node+0x3a/0x200 [btrfs] but this lock took another, RECLAIM_FS-unsafe lock in the past: (&found->groups_sem){+++++.} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&found->groups_sem); local_irq_disable(); lock(&delayed_node->mutex); lock(&found->groups_sem); <Interrupt> lock(&delayed_node->mutex); *** DEADLOCK *** 2 locks held by kswapd0/169: #0: (shrinker_rwsem){++++..}, at: [<ffffffff81159e8a>] shrink_slab+0x3a/0x160 #1: (&type->s_umount_key#27){++++..}, at: [<ffffffff811bac6f>] grab_super_passive+0x3f/0x90 Signed-off-by: Jeff Mahoney <[email protected]> Signed-off-by: Chris Mason <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 72abc8f upstream. I hit the same assert failed as Dolev Raviv reported in Kernel v3.10 shows like this: [ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297) [ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G O 3.10.40 #1 [ 9641.234116] [<c0011a6c>] (unwind_backtrace+0x0/0x12c) from [<c000d0b0>] (show_stack+0x20/0x24) [ 9641.234137] [<c000d0b0>] (show_stack+0x20/0x24) from [<c0311134>] (dump_stack+0x20/0x28) [ 9641.234188] [<c0311134>] (dump_stack+0x20/0x28) from [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) [ 9641.234265] [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) [ 9641.234307] [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) [ 9641.234327] [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) from [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) [ 9641.234344] [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) from [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) [ 9641.234363] [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) from [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) [ 9641.234382] [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) from [<c00f62d8>] (new_slab+0x78/0x238) [ 9641.234400] [<c00f62d8>] (new_slab+0x78/0x238) from [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) [ 9641.234419] [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) from [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) [ 9641.234459] [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) from [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) [ 9641.234553] [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) from [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) [ 9641.234606] [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) from [<c00c17c0>] (filemap_fault+0x304/0x418) [ 9641.234638] [<c00c17c0>] (filemap_fault+0x304/0x418) from [<c00de694>] (__do_fault+0xd4/0x530) [ 9641.234665] [<c00de694>] (__do_fault+0xd4/0x530) from [<c00e10c0>] (handle_pte_fault+0x480/0xf54) [ 9641.234690] [<c00e10c0>] (handle_pte_fault+0x480/0xf54) from [<c00e2bf8>] (handle_mm_fault+0x140/0x184) [ 9641.234716] [<c00e2bf8>] (handle_mm_fault+0x140/0x184) from [<c0316688>] (do_page_fault+0x150/0x3ac) [ 9641.234737] [<c0316688>] (do_page_fault+0x150/0x3ac) from [<c000842c>] (do_DataAbort+0x3c/0xa0) [ 9641.234759] [<c000842c>] (do_DataAbort+0x3c/0xa0) from [<c0314e38>] (__dabt_usr+0x38/0x40) After analyzing the code, I found a condition that may cause this failed in correct operations. Thus, I think this assertion is wrong and should be removed. Suppose there are two clean znodes and one dirty znode in TNC. So the per-filesystem atomic_t @clean_zn_cnt is (2). If commit start, dirty_znode is set to COW_ZNODE in get_znodes_to_commit() in case of potentially ops on this znode. We clear COW bit and DIRTY bit in write_index() without @tnc_mutex locked. We don't increase @clean_zn_cnt in this place. As the comments in write_index() shows, if another process hold @tnc_mutex and dirty this znode after we clean it, @clean_zn_cnt would be decreased to (1). We will increase @clean_zn_cnt to (2) with @tnc_mutex locked in free_obsolete_znodes() to keep it right. If shrink_tnc() performs between decrease and increase, it will release other 2 clean znodes it holds and found @clean_zn_cnt is less than zero (1 - 2 = -1), then hit the assertion. Because free_obsolete_znodes() will soon correct @clean_zn_cnt and no harm to fs in this case, I think this assertion could be removed. 2 clean zondes and 1 dirty znode, @clean_zn_cnt == 2 Thread A (commit) Thread B (write or others) Thread C (shrinker) ->write_index ->clear_bit(DIRTY_NODE) ->clear_bit(COW_ZNODE) @clean_zn_cnt == 2 ->mutex_locked(&tnc_mutex) ->dirty_cow_znode ->!ubifs_zn_cow(znode) ->!test_and_set_bit(DIRTY_NODE) ->atomic_dec(&clean_zn_cnt) ->mutex_unlocked(&tnc_mutex) @clean_zn_cnt == 1 ->mutex_locked(&tnc_mutex) ->shrink_tnc ->destroy_tnc_subtree ->atomic_sub(&clean_zn_cnt, 2) ->ubifs_assert <- hit ->mutex_unlocked(&tnc_mutex) @clean_zn_cnt == -1 ->mutex_lock(&tnc_mutex) ->free_obsolete_znodes ->atomic_inc(&clean_zn_cnt) ->mutux_unlock(&tnc_mutex) @clean_zn_cnt == 0 (correct after shrink) Signed-off-by: hujianyang <[email protected]> Signed-off-by: Artem Bityutskiy <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
…kup detector commit bde92cf upstream. Peter Wu noticed the following splat on his machine when updating /proc/sys/kernel/watchdog_thresh: BUG: sleeping function called from invalid context at mm/slub.c:965 in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init 3 locks held by init/1: #0: (sb_writers#3){.+.+.+}, at: [<ffffffff8117b663>] vfs_write+0x143/0x180 #1: (watchdog_proc_mutex){+.+.+.}, at: [<ffffffff810e02d3>] proc_dowatchdog+0x33/0x110 #2: (cpu_hotplug.lock){.+.+.+}, at: [<ffffffff810589c2>] get_online_cpus+0x32/0x80 Preemption disabled at:[<ffffffff810e0384>] proc_dowatchdog+0xe4/0x110 CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-testing analogdevicesinc#34 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x4e/0x7a __might_sleep+0x11d/0x190 kmem_cache_alloc_trace+0x4e/0x1e0 perf_event_alloc+0x55/0x440 perf_event_create_kernel_counter+0x26/0xe0 watchdog_nmi_enable+0x75/0x140 update_timers_all_cpus+0x53/0xa0 proc_dowatchdog+0xe4/0x110 proc_sys_call_handler+0xb3/0xc0 proc_sys_write+0x14/0x20 vfs_write+0xad/0x180 SyS_write+0x49/0xb0 system_call_fastpath+0x16/0x1b NMI watchdog: disabled (cpu0): hardware events not enabled What happened is after updating the watchdog_thresh, the lockup detector is restarted to utilize the new value. Part of this process involved disabling preemption. Once preemption was disabled, perf tried to allocate a new event (as part of the restart). This caused the above BUG_ON as you can't sleep with preemption disabled. The preemption restriction seemed agressive as we are not doing anything on that particular cpu, but with all the online cpus (which are protected by the get_online_cpus lock). Remove the restriction and the BUG_ON goes away. Signed-off-by: Don Zickus <[email protected]> Acked-by: Michal Hocko <[email protected]> Reported-by: Peter Wu <[email protected]> Tested-by: Peter Wu <[email protected]> Acked-by: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 60e1751 upstream. Avoid that closing /dev/infiniband/umad<n> or /dev/infiniband/issm<n> triggers a use-after-free. __fput() invokes f_op->release() before it invokes cdev_put(). Make sure that the ib_umad_device structure is freed by the cdev_put() call instead of f_op->release(). This avoids that changing the port mode from IB into Ethernet and back to IB followed by restarting opensmd triggers the following kernel oops: general protection fault: 0000 [#1] PREEMPT SMP RIP: 0010:[<ffffffff810cc65c>] [<ffffffff810cc65c>] module_put+0x2c/0x170 Call Trace: [<ffffffff81190f20>] cdev_put+0x20/0x30 [<ffffffff8118e2ce>] __fput+0x1ae/0x1f0 [<ffffffff8118e35e>] ____fput+0xe/0x10 [<ffffffff810723bc>] task_work_run+0xac/0xe0 [<ffffffff81002a9f>] do_notify_resume+0x9f/0xc0 [<ffffffff814b8398>] int_signal+0x12/0x17 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=75051 Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Yann Droneaud <[email protected]> Signed-off-by: Roland Dreier <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 7adb5c8 upstream. At probe time, the musb_am335x driver register its childs by calling of_platform_populate(), which registers all childs in the devicetree hierarchy recursively. On the other side, the driver's remove() function uses of_device_unregister() to remove each child of musb_am335x's. However, when musb_dsps is loaded, its devices are attached to the musb_am335x device as musb_am335x childs. Hence, musb_am335x remove() will attempt to unregister the devices registered by musb_dsps, which produces a kernel panic. In other words, the childs in the "struct device" hierarchy are not the same as the childs in the "devicetree" hierarchy. Ideally, we should enforce the removal of the devices registered by musb_am335x *only*, instead of all its child devices. However, because of the recursive nature of of_platform_populate, this doesn't seem possible. Therefore, as the only solution at hand, this commit disables musb_am335x driver removal capability, preventing it from being ever removed. This was originally suggested by Sebastian Siewior: https://www.mail-archive.com/[email protected]/msg104946.html And for reference, here's the panic upon module removal: musb-hdrc musb-hdrc.0.auto: remove, state 4 usb usb1: USB disconnect, device number 1 musb-hdrc musb-hdrc.0.auto: USB bus 1 deregistered Unable to handle kernel NULL pointer dereference at virtual address 0000008c pgd = de11c000 [0000008c] *pgd=9e174831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] ARM Modules linked in: musb_am335x(-) musb_dsps musb_hdrc usbcore usb_common CPU: 0 PID: 623 Comm: modprobe Not tainted 3.15.0-rc4-00001-g24efd13 analogdevicesinc#69 task: de1b7500 ti: de122000 task.ti: de122000 PC is at am335x_shutdown+0x10/0x28 LR is at am335x_shutdown+0xc/0x28 pc : [<c0327798>] lr : [<c0327794>] psr: a0000013 sp : de123df8 ip : 00000004 fp : 00028f00 r10: 00000000 r9 : de122000 r8 : c000e6c4 r7 : de0e3c10 r6 : de0e3800 r5 : de624010 r4 : de1ec750 r3 : de0e3810 r2 : 00000000 r1 : 00000001 r0 : 00000000 Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 9e11c019 DAC: 00000015 Process modprobe (pid: 623, stack limit = 0xde122240) Stack: (0xde123df8 to 0xde124000) 3de0: de0e3810 bf054488 3e00: bf05444c de624010 60000013 bf043650 000012fc de624010 de0e3810 bf043a20 3e20: de0e3810 bf04b240 c0635b88 c02ca37c c02ca364 c02c8db0 de1b7500 de0e3844 3e40: de0e3810 c02c8e28 c0635b88 de02824c de0e3810 c02c884c de0e3800 de0e3810 3e60: de0e3818 c02c5b20 bf05417c de0e3800 de0e3800 c0635b88 de0f2410 c02ca838 3e80: bf05417c de0e3800 bf055438 c02ca8cc de0e3c10 bf054194 de0e3c10 c02ca37c 3ea0: c02ca364 c02c8db0 de1b7500 de0e3c44 de0e3c10 c02c8e28 c0635b88 de02824c 3ec0: de0e3c10 c02c884c de0e3c10 de0e3c10 de0e3c18 c02c5b20 de0e3c10 de0e3c10 3ee0: 00000000 bf059000 a0000013 c02c5bc0 00000000 bf05900c de0e3c10 c02c5c48 3f00: de0dd0c0 de1ec970 de0f2410 bf05929c de0f2444 bf05902c de0f2410 c02ca37c 3f20: c02ca364 c02c8db0 bf05929c de0f2410 bf05929c c02c94c8 bf05929c 00000000 3f40: 00000800 c02c8ab4 bf0592e0 c007fc40 c00dd820 6273756d 336d615f 00783533 3f60: c064a0ac de1b7500 de122000 de1b7500 c000e590 00000001 c000e6c4 c0060160 3f80: 00028e70 00028e70 00028ea4 00000081 60000010 00028e70 00028e70 00028ea4 3fa0: 00000081 c000e500 00028e70 00028e70 00028ea4 00000800 becb59f8 00027608 3fc0: 00028e70 00028e70 00028ea4 00000081 00000001 00000001 00000000 00028f00 3fe0: b6e6b6f0 becb59d4 000160e8 b6e6b6fc 60000010 00028ea4 00000000 00000000 [<c0327798>] (am335x_shutdown) from [<bf054488>] (dsps_musb_exit+0x3c/0x4c [musb_dsps]) [<bf054488>] (dsps_musb_exit [musb_dsps]) from [<bf043650>] (musb_shutdown+0x80/0x90 [musb_hdrc]) [<bf043650>] (musb_shutdown [musb_hdrc]) from [<bf043a20>] (musb_remove+0x24/0x68 [musb_hdrc]) [<bf043a20>] (musb_remove [musb_hdrc]) from [<c02ca37c>] (platform_drv_remove+0x18/0x1c) [<c02ca37c>] (platform_drv_remove) from [<c02c8db0>] (__device_release_driver+0x70/0xc8) [<c02c8db0>] (__device_release_driver) from [<c02c8e28>] (device_release_driver+0x20/0x2c) [<c02c8e28>] (device_release_driver) from [<c02c884c>] (bus_remove_device+0xdc/0x10c) [<c02c884c>] (bus_remove_device) from [<c02c5b20>] (device_del+0x104/0x198) [<c02c5b20>] (device_del) from [<c02ca838>] (platform_device_del+0x14/0x9c) [<c02ca838>] (platform_device_del) from [<c02ca8cc>] (platform_device_unregister+0xc/0x20) [<c02ca8cc>] (platform_device_unregister) from [<bf054194>] (dsps_remove+0x18/0x38 [musb_dsps]) [<bf054194>] (dsps_remove [musb_dsps]) from [<c02ca37c>] (platform_drv_remove+0x18/0x1c) [<c02ca37c>] (platform_drv_remove) from [<c02c8db0>] (__device_release_driver+0x70/0xc8) [<c02c8db0>] (__device_release_driver) from [<c02c8e28>] (device_release_driver+0x20/0x2c) [<c02c8e28>] (device_release_driver) from [<c02c884c>] (bus_remove_device+0xdc/0x10c) [<c02c884c>] (bus_remove_device) from [<c02c5b20>] (device_del+0x104/0x198) [<c02c5b20>] (device_del) from [<c02c5bc0>] (device_unregister+0xc/0x20) [<c02c5bc0>] (device_unregister) from [<bf05900c>] (of_remove_populated_child+0xc/0x14 [musb_am335x]) [<bf05900c>] (of_remove_populated_child [musb_am335x]) from [<c02c5c48>] (device_for_each_child+0x44/0x70) [<c02c5c48>] (device_for_each_child) from [<bf05902c>] (am335x_child_remove+0x18/0x30 [musb_am335x]) [<bf05902c>] (am335x_child_remove [musb_am335x]) from [<c02ca37c>] (platform_drv_remove+0x18/0x1c) [<c02ca37c>] (platform_drv_remove) from [<c02c8db0>] (__device_release_driver+0x70/0xc8) [<c02c8db0>] (__device_release_driver) from [<c02c94c8>] (driver_detach+0xb4/0xb8) [<c02c94c8>] (driver_detach) from [<c02c8ab4>] (bus_remove_driver+0x4c/0xa0) [<c02c8ab4>] (bus_remove_driver) from [<c007fc40>] (SyS_delete_module+0x128/0x1cc) [<c007fc40>] (SyS_delete_module) from [<c000e500>] (ret_fast_syscall+0x0/0x48) Fixes: 97238b3 ("usb: musb: dsps: use proper child nodes") Acked-by: George Cherian <[email protected]> Signed-off-by: Ezequiel Garcia <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit e4adcff upstream. We need to delete un-finished td from current request's td list at ep_dequeue API, otherwise, this non-user td will be remained at td list before this request is freed. So if we do ep_queue-> ep_dequeue->ep_queue sequence, when the complete interrupt for the second ep_queue comes, we search td list for this request, the first td (added by the first ep_queue) will be handled, and its status is still active, so we will consider the this transfer still not be completed, but in fact, it has completed. It causes the peripheral side considers it never receives current data for this transfer. We met this problem when do "Error Recovery Test - Device Configured" test item for USBCV2 MSC test, the host has never received ACK for the IN token for CSW due to peripheral considers it does not get this CBW, the USBCV test log like belows: -------------------------------------------------------------------------- INFO Issuing BOT MSC Reset, reset should always succeed INFO Retrieving status on CBW endpoint INFO CBW endpoint status = 0x0 INFO Retrieving status on CSW endpoint INFO CSW endpoint status = 0x0 INFO Issuing required command (Test Unit Ready) to verify device has recovered INFO Issuing CBW (attempt #1): INFO |----- CBW LUN = 0x0 INFO |----- CBW Flags = 0x0 INFO |----- CBW Data Transfer Length = 0x0 INFO |----- CBW CDB Length = 0x6 INFO |----- CBW CDB-00 = 0x0 INFO |----- CBW CDB-01 = 0x0 INFO |----- CBW CDB-02 = 0x0 INFO |----- CBW CDB-03 = 0x0 INFO |----- CBW CDB-04 = 0x0 INFO |----- CBW CDB-05 = 0x0 INFO Issuing CSW : try 1 INFO CSW Bulk Request timed out! ERROR Failed CSW phase : should have been success or stall FAIL (5.3.4) The CSW status value must be 0x00, 0x01, or 0x02. ERROR BOTCommonMSCRequest failed: error=80004000 Cc: Andrzej Pietrasiewicz <[email protected]> Signed-off-by: Peter Chen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 7cd2b0a upstream. Oleg reports a division by zero error on zero-length write() to the percpu_pagelist_fraction sysctl: divide error: 0000 [#1] SMP DEBUG_PAGEALLOC CPU: 1 PID: 9142 Comm: badarea_io Not tainted 3.15.0-rc2-vm-nfs+ analogdevicesinc#19 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800d5aeb6e0 ti: ffff8800d87a2000 task.ti: ffff8800d87a2000 RIP: 0010: percpu_pagelist_fraction_sysctl_handler+0x84/0x120 RSP: 0018:ffff8800d87a3e78 EFLAGS: 00010246 RAX: 0000000000000f89 RBX: ffff88011f7fd000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000010 RBP: ffff8800d87a3e98 R08: ffffffff81d002c8 R09: ffff8800d87a3f50 R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000060 R13: ffffffff81c3c3e0 R14: ffffffff81cfddf8 R15: ffff8801193b0800 FS: 00007f614f1e9740(0000) GS:ffff88011f440000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f614f1fa000 CR3: 00000000d9291000 CR4: 00000000000006e0 Call Trace: proc_sys_call_handler+0xb3/0xc0 proc_sys_write+0x14/0x20 vfs_write+0xba/0x1e0 SyS_write+0x46/0xb0 tracesys+0xe1/0xe6 However, if the percpu_pagelist_fraction sysctl is set by the user, it is also impossible to restore it to the kernel default since the user cannot write 0 to the sysctl. This patch allows the user to write 0 to restore the default behavior. It still requires a fraction equal to or larger than 8, however, as stated by the documentation for sanity. If a value in the range [1, 7] is written, the sysctl will return EINVAL. This successfully solves the divide by zero issue at the same time. Signed-off-by: David Rientjes <[email protected]> Reported-by: Oleg Drokin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
…refcnt an atomic_t commit a5049a8 upstream. Hello, So, this patch should do. Joe, Vivek, can one of you guys please verify that the oops goes away with this patch? Jens, the original thread can be read at http://thread.gmane.org/gmane.linux.kernel/1720729 The fix converts blkg->refcnt from int to atomic_t. It does some overhead but it should be minute compared to everything else which is going on and the involved cacheline bouncing, so I think it's highly unlikely to cause any noticeable difference. Also, the refcnt in question should be converted to a perpcu_ref for blk-mq anyway, so the atomic_t is likely to go away pretty soon anyway. Thanks. ------- 8< ------- __blkg_release_rcu() may be invoked after the associated request_queue is released with a RCU grace period inbetween. As such, the function and callbacks invoked from it must not dereference the associated request_queue. This is clearly indicated in the comment above the function. Unfortunately, while trying to fix a different issue, 2a4fd07 ("blkcg: move bulk of blkcg_gq release operations to the RCU callback") ignored this and added [un]locking of @blkg->q->queue_lock to __blkg_release_rcu(). This of course can cause oops as the request_queue may be long gone by the time this code gets executed. general protection fault: 0000 [#1] SMP CPU: 21 PID: 30 Comm: rcuos/21 Not tainted 3.15.0 #1 Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.3:57 12/25/2013 task: ffff880854021de0 ti: ffff88085403c000 task.ti: ffff88085403c000 RIP: 0010:[<ffffffff8162e9e5>] [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60 RSP: 0018:ffff88085403fdf0 EFLAGS: 00010086 RAX: 0000000000020000 RBX: 0000000000000010 RCX: 0000000000000000 RDX: 000060ef80008248 RSI: 0000000000000286 RDI: 6b6b6b6b6b6b6b6b RBP: ffff88085403fdf0 R08: 0000000000000286 R09: 0000000000009f39 R10: 0000000000020001 R11: 0000000000020001 R12: ffff88103c17a130 R13: ffff88103c17a080 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88107fca0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000006e5ab8 CR3: 000000000193d000 CR4: 00000000000407e0 Stack: ffff88085403fe18 ffffffff812cbfc2 ffff88103c17a130 0000000000000000 ffff88103c17a130 ffff88085403fec0 ffffffff810d1d28 ffff880854021de0 ffff880854021de0 ffff88107fcaec58 ffff88085403fe80 ffff88107fcaec30 Call Trace: [<ffffffff812cbfc2>] __blkg_release_rcu+0x72/0x150 [<ffffffff810d1d28>] rcu_nocb_kthread+0x1e8/0x300 [<ffffffff81091d81>] kthread+0xe1/0x100 [<ffffffff8163813c>] ret_from_fork+0x7c/0xb0 Code: ff 47 04 48 8b 7d 08 be 00 02 00 00 e8 55 48 a4 ff 5d c3 0f 1f 00 66 66 66 66 90 55 48 89 e5 +fa 66 66 90 66 66 90 b8 00 00 02 00 <f0> 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f +b7 RIP [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60 RSP <ffff88085403fdf0> The request_queue locking was added because blkcg_gq->refcnt is an int protected with the queue lock and __blkg_release_rcu() needs to put the parent. Let's fix it by making blkcg_gq->refcnt an atomic_t and dropping queue locking in the function. Given the general heavy weight of the current request_queue and blkcg operations, this is unlikely to cause any noticeable overhead. Moreover, blkcg_gq->refcnt is likely to be converted to percpu_ref in the near future, so whatever (most likely negligible) overhead it may add is temporary. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Joe Lawrence <[email protected]> Acked-by: Vivek Goyal <[email protected]> Link: http://lkml.kernel.org/g/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit dc78327 upstream. With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE, the following is triggered at early boot: SMP: Total of 8 processors activated. devtmpfs: initialized Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = fffffe0000050000 [00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407 Internal error: Oops: 96000006 [#1] SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ analogdevicesinc#44 task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000 PC is at __list_add+0x10/0xd4 LR is at free_one_page+0x270/0x638 ... Call trace: __list_add+0x10/0xd4 free_one_page+0x26c/0x638 __free_pages_ok.part.52+0x84/0xbc __free_pages+0x74/0xbc init_cma_reserved_pageblock+0xe8/0x104 cma_init_reserved_areas+0x190/0x1e4 do_one_initcall+0xc4/0x154 kernel_init_freeable+0x204/0x2a8 kernel_init+0xc/0xd4 This happens because init_cma_reserved_pageblock() calls __free_one_page() with pageblock_order as page order but it is bigger than MAX_ORDER. This in turn causes accesses past zone->free_list[]. Fix the problem by changing init_cma_reserved_pageblock() such that it splits pageblock into individual MAX_ORDER pages if pageblock is bigger than a MAX_ORDER page. In cases where !CONFIG_HUGETLB_PAGE_SIZE_VARIABLE, which is all architectures expect for ia64, powerpc and tile at the moment, the �pageblock_order > MAX_ORDER� condition will be optimised out since both sides of the operator are constants. In cases where pageblock size is variable, the performance degradation should not be significant anyway since init_cma_reserved_pageblock() is called only at boot time at most MAX_CMA_AREAS times which by default is eight. Signed-off-by: Michal Nazarewicz <[email protected]> Reported-by: Mark Salter <[email protected]> Tested-by: Mark Salter <[email protected]> Tested-by: Christopher Covington <[email protected]> Cc: Mel Gorman <[email protected]> Cc: David Rientjes <[email protected]> Cc: Marek Szyprowski <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
aolofsson
pushed a commit
that referenced
this pull request
Aug 19, 2014
commit 0378730 upstream. Commit b1cb098 ("change the management method of free objects of the slab") introduced a bug on slab leak detector ('/proc/slab_allocators'). This detector works like as following decription. 1. traverse all objects on all the slabs. 2. determine whether it is active or not. 3. if active, print who allocate this object. but that commit changed the way how to manage free objects, so the logic determining whether it is active or not is also changed. In before, we regard object in cpu caches as inactive one, but, with this commit, we mistakenly regard object in cpu caches as active one. This intoduces kernel oops if DEBUG_PAGEALLOC is enabled. If DEBUG_PAGEALLOC is enabled, kernel_map_pages() is used to detect who corrupt free memory in the slab. It unmaps page table mapping if object is free and map it if object is active. When slab leak detector check object in cpu caches, it mistakenly think this object active so try to access object memory to retrieve caller of allocation. At this point, page table mapping to this object doesn't exist, so oops occurs. Following is oops message reported from Dave. It blew up when something tried to read /proc/slab_allocators (Just cat it, and you should see the oops below) Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: [snip...] CPU: 1 PID: 9386 Comm: trinity-c33 Not tainted 3.14.0-rc5+ analogdevicesinc#131 task: ffff8801aa46e890 ti: ffff880076924000 task.ti: ffff880076924000 RIP: 0010:[<ffffffffaa1a8f4a>] [<ffffffffaa1a8f4a>] handle_slab+0x8a/0x180 RSP: 0018:ffff880076925de0 EFLAGS: 00010002 RAX: 0000000000001000 RBX: 0000000000000000 RCX: 000000005ce85ce7 RDX: ffffea00079be100 RSI: 0000000000001000 RDI: ffff880107458000 RBP: ffff880076925e18 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 000000000000000f R12: ffff8801e6f84000 R13: ffffea00079be100 R14: ffff880107458000 R15: ffff88022bb8d2c0 FS: 00007fb769e45740(0000) GS:ffff88024d040000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8801e6f84ff8 CR3: 00000000a22db000 CR4: 00000000001407e0 DR0: 0000000002695000 DR1: 0000000002695000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000070602 Call Trace: leaks_show+0xce/0x240 seq_read+0x28e/0x490 proc_reg_read+0x3d/0x80 vfs_read+0x9b/0x160 SyS_read+0x58/0xb0 tracesys+0xd4/0xd9 Code: f5 00 00 00 0f 1f 44 00 00 48 63 c8 44 3b 0c 8a 0f 84 e3 00 00 00 83 c0 01 44 39 c0 72 eb 41 f6 47 1a 01 0f 84 e9 00 00 00 89 f0 <4d> 8b 4c 04 f8 4d 85 c9 0f 84 88 00 00 00 49 8b 7e 08 4d 8d 46 RIP handle_slab+0x8a/0x180 To fix the problem, I introduce an object status buffer on each slab. With this, we can track object status precisely, so slab leak detector would not access active object and no kernel oops would occur. Memory overhead caused by this fix is only imposed to CONFIG_DEBUG_SLAB_LEAK which is mainly used for debugging, so memory overhead isn't big problem. Signed-off-by: Joonsoo Kim <[email protected]> Reported-by: Dave Jones <[email protected]> Reported-by: Tetsuo Handa <[email protected]> Reviewed-by: Vladimir Davydov <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
mapping->host can be NULL and shouldn't be dereferenced before being checked. [ 1295.741844] GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [parallella#1] SMP KASAN [ 1295.746387] Dumping ftrace buffer: [ 1295.748217] (ftrace buffer empty) [ 1295.749527] Modules linked in: [ 1295.750268] CPU: 62 PID: 23410 Comm: trinity-c70 Not tainted 3.19.0-next-20150219-sasha-00045-g9130270f analogdevicesinc#1939 [ 1295.750268] task: ffff8803a49db000 ti: ffff8803a4dc8000 task.ti: ffff8803a4dc8000 [ 1295.750268] RIP: shmem_mapping (mm/shmem.c:1458) [ 1295.750268] RSP: 0000:ffff8803a4dcfbf8 EFLAGS: 00010206 [ 1295.750268] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 00000000000f2804 [ 1295.750268] RDX: 0000000000000005 RSI: 0400000000000794 RDI: 0000000000000028 [ 1295.750268] RBP: ffff8803a4dcfc08 R08: 0000000000000000 R09: 00000000031de000 [ 1295.750268] R10: dffffc0000000000 R11: 00000000031c1000 R12: 0400000000000794 [ 1295.750268] R13: 00000000031c2000 R14: 00000000031de000 R15: ffff880e3bdc1000 [ 1295.750268] FS: 00007f8703c7e700(0000) GS:ffff881164800000(0000) knlGS:0000000000000000 [ 1295.750268] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1295.750268] CR2: 0000000004e58000 CR3: 00000003a9f3c000 CR4: 00000000000007a0 [ 1295.750268] DR0: ffffffff81000000 DR1: 0000009494949494 DR2: 0000000000000000 [ 1295.750268] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000d0602 [ 1295.750268] Stack: [ 1295.750268] ffff8803a4dcfec8 ffffffffbb1dc770 ffff8803a4dcfc38 ffffffffad6f230b [ 1295.750268] ffffffffad6f2b0d 0000014100000000 ffff88001e17c08b ffff880d9453fe08 [ 1295.750268] ffff8803a4dcfd18 ffffffffad6f2ce2 ffff8803a49dbcd8 ffff8803a49dbce0 [ 1295.750268] Call Trace: [ 1295.750268] mincore_page (mm/mincore.c:61) [ 1295.750268] ? mincore_pte_range (include/linux/spinlock.h:312 mm/mincore.c:131) [ 1295.750268] mincore_pte_range (mm/mincore.c:151) [ 1295.750268] ? mincore_unmapped_range (mm/mincore.c:113) [ 1295.750268] __walk_page_range (mm/pagewalk.c:51 mm/pagewalk.c:90 mm/pagewalk.c:116 mm/pagewalk.c:204) [ 1295.750268] walk_page_range (mm/pagewalk.c:275) [ 1295.750268] SyS_mincore (mm/mincore.c:191 mm/mincore.c:253 mm/mincore.c:220) [ 1295.750268] ? mincore_pte_range (mm/mincore.c:220) [ 1295.750268] ? mincore_unmapped_range (mm/mincore.c:113) [ 1295.750268] ? __mincore_unmapped_range (mm/mincore.c:105) [ 1295.750268] ? ptlock_free (mm/mincore.c:24) [ 1295.750268] ? syscall_trace_enter (arch/x86/kernel/ptrace.c:1610) [ 1295.750268] ia32_do_call (arch/x86/ia32/ia32entry.S:446) [ 1295.750268] Code: e5 48 c1 ea 03 53 48 89 fb 48 83 ec 08 80 3c 02 00 75 4f 48 b8 00 00 00 00 00 fc ff df 48 8b 1b 48 8d 7b 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 3f 48 b8 00 00 00 00 00 fc ff df 48 8b 5b 28 48 All code ======== 0: e5 48 in $0x48,%eax 2: c1 ea 03 shr $0x3,%edx 5: 53 push %rbx 6: 48 89 fb mov %rdi,%rbx 9: 48 83 ec 08 sub $0x8,%rsp d: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) 11: 75 4f jne 0x62 13: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 1a: fc ff df 1d: 48 8b 1b mov (%rbx),%rbx 20: 48 8d 7b 28 lea 0x28(%rbx),%rdi 24: 48 89 fa mov %rdi,%rdx 27: 48 c1 ea 03 shr $0x3,%rdx 2b:* 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2f: 75 3f jne 0x70 31: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 38: fc ff df 3b: 48 8b 5b 28 mov 0x28(%rbx),%rbx 3f: 48 rex.W ... Code starting with the faulting instruction =========================================== 0: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) 4: 75 3f jne 0x45 6: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax d: fc ff df 10: 48 8b 5b 28 mov 0x28(%rbx),%rbx 14: 48 rex.W ... [ 1295.750268] RIP shmem_mapping (mm/shmem.c:1458) [ 1295.750268] RSP <ffff8803a4dcfbf8> Fixes: 97b713b ("fs: kill BDI_CAP_SWAP_BACKED") Signed-off-by: Sasha Levin <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
As the devicetree binding doesn't require num_cs to exist or be strictly positive, and neither does the platform data case, a bug appear when num_cs is set to 0 and panics the kernel. The issue is that in alloc_nand_resource(), chip is dereferenced without having a value assigned when num_cs == 0. Fix this by returning ENODEV is num_cs == 0. The panic seen is : Unable to handle kernel NULL pointer dereference at virtual address 000002b8 pgd = c0004000 [000002b8] *pgd=00000000 Internal error: Oops: 5 [parallella#1] PREEMPT ARM Modules linked in: Hardware name: Marvell PXA3xx (Device Tree Support) task: c3822aa0 ti: c3826000 task.ti: c3826000 PC is at alloc_nand_resource+0x180/0x4a8 LR is at alloc_nand_resource+0xa0/0x4a8 pc : [<c0275b90>] lr : [<c0275ab0>] psr: 68000013 sp : c3827d90 ip : 00000000 fp : 00000000 r10: c3862200 r9 : 0000005e r8 : 00000000 r7 : c3865610 r6 : c3862210 r5 : c3924210 r4 : c3862200 r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 0000397f Table: 80004018 DAC: 00000035 Process swapper (pid: 1, stack limit = 0xc3826198) Stack: (0xc3827d90 to 0xc3828000) ...zip... [<c0275b90>] (alloc_nand_resource) from [<c0275ff8>] (pxa3xx_nand_probe+0x140/0x978) [<c0275ff8>] (pxa3xx_nand_probe) from [<c0258c40>] (platform_drv_probe+0x48/0xa4) [<c0258c40>] (platform_drv_probe) from [<c0257650>] (driver_probe_device+0x80/0x21c) [<c0257650>] (driver_probe_device) from [<c0257878>] (__driver_attach+0x8c/0x90) [<c0257878>] (__driver_attach) from [<c0255ec4>] (bus_for_each_dev+0x58/0x88) [<c0255ec4>] (bus_for_each_dev) from [<c0256ec8>] (bus_add_driver+0xd8/0x1d4) [<c0256ec8>] (bus_add_driver) from [<c0257f14>] (driver_register+0x78/0xf4) [<c0257f14>] (driver_register) from [<c00088a8>] (do_one_initcall+0x80/0x1e4) [<c00088a8>] (do_one_initcall) from [<c048ed08>] (kernel_init_freeable+0xec/0x1b4) [<c048ed08>] (kernel_init_freeable) from [<c0377d8c>] (kernel_init+0x8/0xe4) [<c0377d8c>] (kernel_init) from [<c00095f8>] (ret_from_fork+0x14/0x3c) Code: e503b234 e5953008 e1530001 caffffd1 (e59002b8) ---[ end trace a5770060c8441895 ]--- Signed-off-by: Robert Jarzmik <[email protected]> Acked-by: Ezequiel Garcia <[email protected]> Signed-off-by: Brian Norris <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
We did a failed attempt in the past to only use rcu in rtnl dump operations (commit e67f88d "net: dont hold rtnl mutex during netlink dump callbacks") Now that dumps are holding RTNL anyway, there is no need to also use rcu locking, as it forbids any scheduling ability, like GFP_KERNEL allocations that controlling path should use instead of GFP_ATOMIC whenever possible. This should fix following splat Cong Wang reported : [ INFO: suspicious RCU usage. ] 3.19.0+ analogdevicesinc#805 Tainted: G W include/linux/rcupdate.h:538 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ip/771: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff8182b8f4>] netlink_dump+0x21/0x26c parallella#1: (rcu_read_lock){......}, at: [<ffffffff817d785b>] rcu_read_lock+0x0/0x6e stack backtrace: CPU: 3 PID: 771 Comm: ip Tainted: G W 3.19.0+ analogdevicesinc#805 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000001 ffff8800d51e7718 ffffffff81a27457 0000000029e729e6 ffff8800d6108000 ffff8800d51e7748 ffffffff810b539b ffffffff820013dd 00000000000001c8 0000000000000000 ffff8800d7448088 ffff8800d51e7758 Call Trace: [<ffffffff81a27457>] dump_stack+0x4c/0x65 [<ffffffff810b539b>] lockdep_rcu_suspicious+0x107/0x110 [<ffffffff8109796f>] rcu_preempt_sleep_check+0x45/0x47 [<ffffffff8109e457>] ___might_sleep+0x1d/0x1cb [<ffffffff8109e67d>] __might_sleep+0x78/0x80 [<ffffffff814b9b1f>] idr_alloc+0x45/0xd1 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff814b9f9d>] ? idr_for_each+0x53/0x101 [<ffffffff817c1383>] alloc_netid+0x61/0x69 [<ffffffff817c14c3>] __peernet2id+0x79/0x8d [<ffffffff817c1ab7>] peernet2id+0x13/0x1f [<ffffffff817d8673>] rtnl_fill_ifinfo+0xa8d/0xc20 [<ffffffff810b17d9>] ? __lock_is_held+0x39/0x52 [<ffffffff817d894f>] rtnl_dump_ifinfo+0x149/0x213 [<ffffffff8182b9c2>] netlink_dump+0xef/0x26c [<ffffffff8182bcba>] netlink_recvmsg+0x17b/0x2c5 [<ffffffff817b0adc>] __sock_recvmsg+0x4e/0x59 [<ffffffff817b1b40>] sock_recvmsg+0x3f/0x51 [<ffffffff817b1f9a>] ___sys_recvmsg+0xf6/0x1d9 [<ffffffff8115dc67>] ? handle_pte_fault+0x6e1/0xd3d [<ffffffff8100a3a0>] ? native_sched_clock+0x35/0x37 [<ffffffff8109f45b>] ? sched_clock_local+0x12/0x72 [<ffffffff8109f6ac>] ? sched_clock_cpu+0x9e/0xb7 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff811abde8>] ? __fcheck_files+0x4c/0x58 [<ffffffff811ac556>] ? __fget_light+0x2d/0x52 [<ffffffff817b376f>] __sys_recvmsg+0x42/0x60 [<ffffffff817b379f>] SyS_recvmsg+0x12/0x1c Signed-off-by: Eric Dumazet <[email protected]> Fixes: 0c7aecd ("netns: add rtnl cmd to add and get peer netns ids") Cc: Nicolas Dichtel <[email protected]> Reported-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
During system reboot, the sh-dma-engine device may be runtime-suspended, causing a crash: Unhandled fault: imprecise external abort (0x1406) at 0x0002c02c Internal error: : 1406 [parallella#1] SMP ARM ... PC is at sh_dmae_ctl_stop+0x28/0x64 LR is at sh_dmae_ctl_stop+0x24/0x64 If the sh-dma-engine is runtime-suspended, its module clock is turned off, and its registers cannot be accessed. To fix this, move the call to sh_dmae_ctl_stop(), which touches the DMAOR register, to the sh_dmae_suspend() and sh_dmae_runtime_suspend() callbacks. This makes PM operations more symmetric, as both sh_dmae_resume() and sh_dmae_runtime_resume() already call sh_dmae_rst() to re-initialize the DMAOR register. Remove sh_dmae_shutdown(), as it became empty. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Vinod Koul <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
We can have multiple fsync operations against the same file during the same transaction and they can collect the same ordered extents while they don't complete (still accessible from the inode's ordered tree). If this happens, those ordered extents will never get their reference counts decremented to 0, leading to memory leaks and inode leaks (an iput for an ordered extent's inode is scheduled only when the ordered extent's refcount drops to 0). The following sequence diagram explains this race: CPU 1 CPU 2 btrfs_sync_file() btrfs_sync_file() mutex_lock(inode->i_mutex) btrfs_log_inode() btrfs_get_logged_extents() --> collects ordered extent X --> increments ordered extent X's refcount btrfs_submit_logged_extents() mutex_unlock(inode->i_mutex) mutex_lock(inode->i_mutex) btrfs_sync_log() btrfs_wait_logged_extents() --> list_del_init(&ordered->log_list) btrfs_log_inode() btrfs_get_logged_extents() --> Adds ordered extent X to logged_list because at this point: list_empty(&ordered->log_list) && test_bit(BTRFS_ORDERED_LOGGED, &ordered->flags) == 0 --> Increments ordered extent X's refcount --> check if ordered extent's io is finished or not, start it if necessary and wait for it to finish --> sets bit BTRFS_ORDERED_LOGGED on ordered extent X's flags and adds it to trans->ordered btrfs_sync_log() finishes btrfs_submit_logged_extents() btrfs_log_inode() finishes mutex_unlock(inode->i_mutex) btrfs_sync_file() finishes btrfs_sync_log() btrfs_wait_logged_extents() --> Sees ordered extent X has the bit BTRFS_ORDERED_LOGGED set in its flags --> X's refcount is untouched btrfs_sync_log() finishes btrfs_sync_file() finishes btrfs_commit_transaction() --> called by transaction kthread for e.g. btrfs_wait_pending_ordered() --> waits for ordered extent X to complete --> decrements ordered extent X's refcount by 1 only, corresponding to the increment done by the fsync task ran by CPU 1 In the scenario of the above diagram, after the transaction commit, the ordered extent will remain with a refcount of 1 forever, leaking the ordered extent structure and preventing the i_count of its inode from ever decreasing to 0, since the delayed iput is scheduled only when the ordered extent's refcount drops to 0, preventing the inode from ever being evicted by the VFS. Fix this by using the flag BTRFS_ORDERED_LOGGED differently. Use it to mean that an ordered extent is already being processed by an fsync call, which will attach it to the current transaction, preventing it from being collected by subsequent fsync operations against the same inode. This race was introduced with the following change (added in 3.19 and backported to stable 3.18 and 3.17): Btrfs: make sure logged extents complete in the current transaction V3 commit 50d9aa9 I ran into this issue while running xfstests/generic/113 in a loop, which failed about 1 out of 10 runs with the following warning in dmesg: [ 2612.440038] WARNING: CPU: 4 PID: 22057 at fs/btrfs/disk-io.c:3558 free_fs_root+0x36/0x133 [btrfs]() [ 2612.442810] Modules linked in: btrfs crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop processor parport_pc parport psmouse therma l_sys i2c_piix4 serio_raw pcspkr evdev microcode button i2c_core ext4 crc16 jbd2 mbcache sd_mod sg sr_mod cdrom virtio_scsi ata_generic virtio_pci ata_piix virtio_ring libata virtio flo ppy e1000 scsi_mod [last unloaded: btrfs] [ 2612.452711] CPU: 4 PID: 22057 Comm: umount Tainted: G W 3.19.0-rc5-btrfs-next-4+ parallella#1 [ 2612.454921] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 2612.457709] 0000000000000009 ffff8801342c3c78 ffffffff8142425e ffff88023ec8f2d8 [ 2612.459829] 0000000000000000 ffff8801342c3cb8 ffffffff81045308 ffff880046460000 [ 2612.461564] ffffffffa036da56 ffff88003d07b000 ffff880046460000 ffff880046460068 [ 2612.463163] Call Trace: [ 2612.463719] [<ffffffff8142425e>] dump_stack+0x4c/0x65 [ 2612.464789] [<ffffffff81045308>] warn_slowpath_common+0xa1/0xbb [ 2612.466026] [<ffffffffa036da56>] ? free_fs_root+0x36/0x133 [btrfs] [ 2612.467247] [<ffffffff810453c5>] warn_slowpath_null+0x1a/0x1c [ 2612.468416] [<ffffffffa036da56>] free_fs_root+0x36/0x133 [btrfs] [ 2612.469625] [<ffffffffa036f2a7>] btrfs_drop_and_free_fs_root+0x93/0x9b [btrfs] [ 2612.471251] [<ffffffffa036f353>] btrfs_free_fs_roots+0xa4/0xd6 [btrfs] [ 2612.472536] [<ffffffff8142612e>] ? wait_for_completion+0x24/0x26 [ 2612.473742] [<ffffffffa0370bbc>] close_ctree+0x1f3/0x33c [btrfs] [ 2612.475477] [<ffffffff81059d1d>] ? destroy_workqueue+0x148/0x1ba [ 2612.476695] [<ffffffffa034e3da>] btrfs_put_super+0x19/0x1b [btrfs] [ 2612.477911] [<ffffffff81153e53>] generic_shutdown_super+0x73/0xef [ 2612.479106] [<ffffffff811540e2>] kill_anon_super+0x13/0x1e [ 2612.480226] [<ffffffffa034e1e3>] btrfs_kill_super+0x17/0x23 [btrfs] [ 2612.481471] [<ffffffff81154307>] deactivate_locked_super+0x3b/0x50 [ 2612.482686] [<ffffffff811547a7>] deactivate_super+0x3f/0x43 [ 2612.483791] [<ffffffff8116b3ed>] cleanup_mnt+0x59/0x78 [ 2612.484842] [<ffffffff8116b44c>] __cleanup_mnt+0x12/0x14 [ 2612.485900] [<ffffffff8105d019>] task_work_run+0x8f/0xbc [ 2612.486960] [<ffffffff810028d8>] do_notify_resume+0x5a/0x6b [ 2612.488083] [<ffffffff81236e5b>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 2612.489333] [<ffffffff8142a17f>] int_signal+0x12/0x17 [ 2612.490353] ---[ end trace 54a960a6bdcb8d93 ]--- [ 2612.557253] VFS: Busy inodes after unmount of sdb. Self-destruct in 5 seconds. Have a nice day... Kmemleak confirmed the ordered extent leak (and btrfs inode specific structures such as delayed nodes): $ cat /sys/kernel/debug/kmemleak unreferenced object 0xffff880154290db0 (size 576): comm "btrfsck", pid 21980, jiffies 4295542503 (age 1273.412s) hex dump (first 32 bytes): 01 40 00 00 01 00 00 00 b0 1d f1 4e 01 88 ff ff [email protected].... 00 00 00 00 00 00 00 00 c8 0d 29 54 01 88 ff ff ..........)T.... backtrace: [<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a [<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83 [<ffffffff8122fb26>] __radix_tree_create+0x109/0x190 [<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac [<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs] [<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs] [<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs] [<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs] [<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs] [<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed [<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa [<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b [<ffffffff81429e92>] system_call_fastpath+0x12/0x17 [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffff88014ef11db0 (size 576): comm "rm", pid 22009, jiffies 4295542593 (age 1273.052s) hex dump (first 32 bytes): 02 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 c8 1d f1 4e 01 88 ff ff ...........N.... backtrace: [<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a [<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83 [<ffffffff8122fb26>] __radix_tree_create+0x109/0x190 [<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac [<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs] [<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs] [<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs] [<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs] [<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs] [<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed [<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa [<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b [<ffffffff81429e92>] system_call_fastpath+0x12/0x17 [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffff8800336feda8 (size 584): comm "aio-stress", pid 22031, jiffies 4295543006 (age 1271.400s) hex dump (first 32 bytes): 00 40 3e 00 00 00 00 00 00 00 8f 42 00 00 00 00 .@>........B.... 00 00 01 00 00 00 00 00 00 00 01 00 00 00 00 00 ................ backtrace: [<ffffffff8114eb34>] create_object+0x172/0x29a [<ffffffff8141d790>] kmemleak_alloc+0x25/0x41 [<ffffffff81141ae6>] kmemleak_alloc_recursive.constprop.52+0x16/0x18 [<ffffffff81145288>] kmem_cache_alloc+0xf7/0x198 [<ffffffffa0389243>] __btrfs_add_ordered_extent+0x43/0x309 [btrfs] [<ffffffffa038968b>] btrfs_add_ordered_extent_dio+0x12/0x14 [btrfs] [<ffffffffa03810e2>] btrfs_get_blocks_direct+0x3ef/0x571 [btrfs] [<ffffffff81181349>] do_blockdev_direct_IO+0x62a/0xb47 [<ffffffff8118189a>] __blockdev_direct_IO+0x34/0x36 [<ffffffffa03776e5>] btrfs_direct_IO+0x16a/0x1e8 [btrfs] [<ffffffff81100373>] generic_file_direct_write+0xb8/0x12d [<ffffffffa038615c>] btrfs_file_write_iter+0x24b/0x42f [btrfs] [<ffffffff8118bb0d>] aio_run_iocb+0x2b7/0x32e [<ffffffff8118c99a>] do_io_submit+0x26e/0x2ff [<ffffffff8118ca3b>] SyS_io_submit+0x10/0x12 [<ffffffff81429e92>] system_call_fastpath+0x12/0x17 CC: <[email protected]> # 3.19, 3.18 and 3.17 Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: Chris Mason <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
Passing zeroed drm_radeon_cs struct to DRM_IOCTL_RADEON_CS produces the following oops. Fix by always calling INIT_LIST_HEAD() to avoid the crash in list_sort(). ---------------------------------- #include <stdint.h> #include <fcntl.h> #include <unistd.h> #include <sys/ioctl.h> #include <drm/radeon_drm.h> static const struct drm_radeon_cs cs; int main(int argc, char **argv) { return ioctl(open(argv[1], O_RDWR), DRM_IOCTL_RADEON_CS, &cs); } ---------------------------------- [ttrantal@test2 ~]$ ./main /dev/dri/card0 [ 46.904650] BUG: unable to handle kernel NULL pointer dereference at (null) [ 46.905022] IP: [<ffffffff814d6df2>] list_sort+0x42/0x240 [ 46.905022] PGD 68f29067 PUD 688b5067 PMD 0 [ 46.905022] Oops: 0002 [parallella#1] SMP [ 46.905022] CPU: 0 PID: 2413 Comm: main Not tainted 4.0.0-rc1+ analogdevicesinc#58 [ 46.905022] Hardware name: Hewlett-Packard HP Compaq dc5750 Small Form Factor/0A64h, BIOS 786E3 v02.10 01/25/2007 [ 46.905022] task: ffff880058e2bcc0 ti: ffff880058e64000 task.ti: ffff880058e64000 [ 46.905022] RIP: 0010:[<ffffffff814d6df2>] [<ffffffff814d6df2>] list_sort+0x42/0x240 [ 46.905022] RSP: 0018:ffff880058e67998 EFLAGS: 00010246 [ 46.905022] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 46.905022] RDX: ffffffff81644410 RSI: ffff880058e67b40 RDI: ffff880058e67a58 [ 46.905022] RBP: ffff880058e67a88 R08: 0000000000000000 R09: 0000000000000000 [ 46.905022] R10: ffff880058e2bcc0 R11: ffffffff828e6ca0 R12: ffffffff81644410 [ 46.905022] R13: ffff8800694b8018 R14: 0000000000000000 R15: ffff880058e679b0 [ 46.905022] FS: 00007fdc65a65700(0000) GS:ffff88006d600000(0000) knlGS:0000000000000000 [ 46.905022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 46.905022] CR2: 0000000000000000 CR3: 0000000058dd9000 CR4: 00000000000006f0 [ 46.905022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 46.905022] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400 [ 46.905022] Stack: [ 46.905022] ffff880058e67b40 ffff880058e2bcc0 ffff880058e67a78 0000000000000000 [ 46.905022] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 46.905022] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 46.905022] Call Trace: [ 46.905022] [<ffffffff81644a65>] radeon_cs_parser_fini+0x195/0x220 [ 46.905022] [<ffffffff81645069>] radeon_cs_ioctl+0xa9/0x960 [ 46.905022] [<ffffffff815e1f7c>] drm_ioctl+0x19c/0x640 [ 46.905022] [<ffffffff810f8fdd>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [ 46.905022] [<ffffffff810f90ad>] ? trace_hardirqs_on+0xd/0x10 [ 46.905022] [<ffffffff8160c066>] radeon_drm_ioctl+0x46/0x80 [ 46.905022] [<ffffffff81211868>] do_vfs_ioctl+0x318/0x570 [ 46.905022] [<ffffffff81462ef6>] ? selinux_file_ioctl+0x56/0x110 [ 46.905022] [<ffffffff81211b41>] SyS_ioctl+0x81/0xa0 [ 46.905022] [<ffffffff81dc6312>] system_call_fastpath+0x12/0x17 [ 46.905022] Code: 48 89 b5 10 ff ff ff 0f 84 03 01 00 00 4c 8d bd 28 ff ff ff 31 c0 48 89 fb b9 15 00 00 00 49 89 d4 4c 89 ff f3 48 ab 48 8b 46 08 <48> c7 00 00 00 00 00 48 8b 0e 48 85 c9 0f 84 7d 00 00 00 c7 85 [ 46.905022] RIP [<ffffffff814d6df2>] list_sort+0x42/0x240 [ 46.905022] RSP <ffff880058e67998> [ 46.905022] CR2: 0000000000000000 [ 47.149253] ---[ end trace 09576b4e8b2c20b8 ]--- Reviewed-by: Christian König <[email protected]> Signed-off-by: Tommi Rantala <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
Under z/VM PQAP might trigger an operation exception if no crypto cards are defined via APVIRTUAL or APDEDICATED. [ 386.098666] Kernel BUG at 0000000000135c56 [verbose debug info unavailable] [ 386.098693] illegal operation: 0001 ilc:2 [parallella#1] SMP [...] [ 386.098751] Krnl PSW : 0704c00180000000 0000000000135c56 (kvm_s390_apxa_installed+0x46/0x98) [...] [ 386.098804] [<000000000013627c>] kvm_arch_init_vm+0x29c/0x358 [ 386.098806] [<000000000012d008>] kvm_dev_ioctl+0xc0/0x460 [ 386.098809] [<00000000002c639a>] do_vfs_ioctl+0x332/0x508 [ 386.098811] [<00000000002c660e>] SyS_ioctl+0x9e/0xb0 [ 386.098814] [<000000000070476a>] system_call+0xd6/0x258 [ 386.098815] [<000003fffc7400a2>] 0x3fffc7400a2 Lets add an extable entry and provide a zeroed config in that case. Reported-by: Stefan Zimmermann <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]> Reviewed-by: Thomas Huth <[email protected]> Tested-by: Stefan Zimmermann <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
When do suspend/resume stress test, some log shows "rcv is not +last". The issue is that enet suspend will disable phy clock, phy link down, after resume back, enet MAC redo initial and ready to tx/rx packet, but phy still is not ready which is doing auto-negotiation. When phy link is not up, don't schdule napi soft irq. [Peter] It has fixed kernel panic after long time suspend/resume test with nfs rootfs. [ 8864.429458] fec 2188000.ethernet eth0: rcv is not +last [ 8864.434799] fec 2188000.ethernet eth0: rcv is not +last [ 8864.440088] fec 2188000.ethernet eth0: rcv is not +last [ 8864.445424] fec 2188000.ethernet eth0: rcv is not +last [ 8864.450782] fec 2188000.ethernet eth0: rcv is not +last [ 8864.456111] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 8864.464225] pgd = 80004000 [ 8864.466997] [00000000] *pgd=00000000 [ 8864.470627] Internal error: Oops: 17 [parallella#1] SMP ARM [ 8864.475353] Modules linked in: evbug [ 8864.479006] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.0.0-rc1-00044-g7a2a1d2 analogdevicesinc#234 [ 8864.486854] Hardware name: Freescale i.MX6 SoloX (Device Tree) [ 8864.492709] task: be069380 ti: be07a000 task.ti: be07a000 [ 8864.498137] PC is at memcpy+0x80/0x330 [ 8864.501919] LR is at gro_pull_from_frag0+0x34/0xa8 [ 8864.506735] pc : [<802bb080>] lr : [<8057c204>] psr: 00000113 [ 8864.506735] sp : be07bbd4 ip : 00000010 fp : be07bc0c [ 8864.518235] r10: 0000000e r9 : 00000000 r8 : 809c7754 [ 8864.523479] r7 : 809c7754 r6 : bb43c040 r5 : bd280cc0 r4 : 00000012 [ 8864.530025] r3 : 00000804 r2 : fffffff2 r1 : 00000000 r0 : bb43b83c [ 8864.536575] Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel [ 8864.543904] Control: 10c5387d Table: bd14c04a DAC: 00000015 [ 8864.549669] Process ksoftirqd/0 (pid: 3, stack limit = 0xbe07a210) [ 8864.555869] Stack: (0xbe07bbd4 to 0xbe07c000) [ 8864.560250] bbc0: bd280cc0 bb43c040 809c7754 [ 8864.568455] bbe0: 809c7754 bb43b83c 00000012 8057c204 00000000 bd280cc0 bd8a0718 00000003 [ 8864.576658] bc00: be07bc5c be07bc10 8057ebf0 8057c1dc 00000000 00000000 8057ecc4 bef59760 [ 8864.584863] bc20: 00000002 bd8a0000 be07bc64 809c7754 00000000 bd8a0718 bd280cc0 bd8a0000 [ 8864.593066] bc40: 00000000 0000001c 00000000 bd8a0000 be07bc74 be07bc60 8057f148 8057eb90 [ 8864.601268] bc60: bf0810a0 00000000 be07bcf4 be07bc78 8044e7b4 8057f12c 00000000 8007df6c [ 8864.609470] bc80: bd8a0718 00000040 00000000 bd280a80 00000002 00000019 bd8a0600 bd8a1214 [ 8864.617672] bca0: bd8a0690 bf0810a0 00000000 00000000 bd8a1000 00000000 00000027 bd280cc0 [ 8864.625874] bcc0: 80062708 800625cc 000943db bd8a0718 00000001 000d1166 00000040 be7c1ec0 [ 8864.634077] bce0: 0000012c be07bd00 be07bd3c be07bcf8 8057fc98 8044e3ac 809c2ec0 3ddff000 [ 8864.642280] bd00: be07bd00 be07bd00 be07bd08 be07bd08 00000000 00000020 809c608c 00000003 [ 8864.650481] bd20: 809c6080 40000001 809c6088 00200100 be07bd84 be07bd40 8002e690 8057fac8 [ 8864.658684] bd40: be07bd64 be07bd50 00000001 04208040 000d1165 0000000a be07bd84 809c0d7c [ 8864.666885] bd60: 00000000 809c6af8 00000000 00000001 be008000 00000000 be07bd9c be07bd88 [ 8864.675087] bd80: 8002eb64 8002e564 00000125 809c0d7c be07bdc4 be07bda0 8006f100 8002eaac [ 8864.683291] bda0: c080e10c be07bde8 809c6c6c c080e100 00000002 00000000 be07bde4 be07bdc8 [ 8864.691492] bdc0: 800087a0 8006f098 806f2934 20000013 ffffffff be07be1c be07be44 be07bde8 [ 8864.699695] bde0: 800133a4 80008784 00000001 00000001 00000000 00000000 be7c1680 00000000 [ 8864.707896] be00: be0cfe00 bd93eb40 00000002 00000000 00000000 be07be44 be07be00 be07be30 [ 8864.716098] be20: 8006278c 806f2934 20000013 ffffffff be069380 be7c1680 be07be7c be07be48 [ 8864.724300] be40: 80049cfc 806f2910 00000001 00000000 80049cb4 00000000 be07be7c be7c1680 [ 8864.732502] be60: be3289c0 be069380 bd23b600 be0cfe00 be07bebc be07be80 806ed614 80049c68 [ 8864.740706] be80: be07a000 0000020a 809c608c 00000003 00000001 8002e858 be07a000 be035740 [ 8864.748907] bea0: 00000000 00000001 809d4598 00000000 be07bed4 be07bec0 806edd0c 806ed440 [ 8864.757110] bec0: be07a000 be07a000 be07bee4 be07bed8 806edd68 806edcf0 be07bef4 be07bee8 [ 8864.765311] bee0: 8002e860 806edd34 be07bf24 be07bef8 800494b0 8002e828 be069380 00000000 [ 8864.773512] bf00: be035780 be035740 8004938c 00000000 00000000 00000000 be07bfac be07bf28 [ 8864.781715] bf20: 80045928 80049398 be07bf44 00000001 00000000 be035740 00000000 00030003 [ 8864.789917] bf40: dead4ead ffffffff ffffffff 80a2716c 80b59b00 00000000 8088c954 be07bf5c [ 8864.798120] bf60: be07bf5c 00000000 00000000 dead4ead ffffffff ffffffff 80a2716c 00000000 [ 8864.806320] bf80: 00000000 8088c954 be07bf88 be07bf88 be035780 8004584c 00000000 00000000 [ 8864.814523] bfa0: 00000000 be07bfb0 8000ed10 80045858 00000000 00000000 00000000 00000000 [ 8864.822723] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 8864.830925] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 5ffbb5f7 f9fcf5e7 [ 8864.839115] Backtrace: [ 8864.841631] [<8057c1d0>] (gro_pull_from_frag0) from [<8057ebf0>] (dev_gro_receive+0x6c/0x3f8) [ 8864.850173] r6:00000003 r5:bd8a0718 r4:bd280cc0 r3:00000000 [ 8864.855958] [<8057eb84>] (dev_gro_receive) from [<8057f148>] (napi_gro_receive+0x28/0xac) [ 8864.864152] r10:bd8a0000 r9:00000000 r8:0000001c r7:00000000 r6:bd8a0000 r5:bd280cc0 [ 8864.872115] r4:bd8a0718 [ 8864.874713] [<8057f120>] (napi_gro_receive) from [<8044e7b4>] (fec_enet_rx_napi+0x414/0xc74) [ 8864.883167] r5:00000000 r4:bf0810a0 [ 8864.886823] [<8044e3a0>] (fec_enet_rx_napi) from [<8057fc98>] (net_rx_action+0x1dc/0x2ec) [ 8864.895016] r10:be07bd00 r9:0000012c r8:be7c1ec0 r7:00000040 r6:000d1166 r5:00000001 [ 8864.902982] r4:bd8a0718 [ 8864.905570] [<8057fabc>] (net_rx_action) from [<8002e690>] (__do_softirq+0x138/0x2c4) [ 8864.913417] r10:00200100 r9:809c6088 r8:40000001 r7:809c6080 r6:00000003 r5:809c608c [ 8864.921382] r4:00000020 [ 8864.923966] [<8002e558>] (__do_softirq) from [<8002eb64>] (irq_exit+0xc4/0x138) [ 8864.931289] r10:00000000 r9:be008000 r8:00000001 r7:00000000 r6:809c6af8 r5:00000000 [ 8864.939252] r4:809c0d7c [ 8864.941841] [<8002eaa0>] (irq_exit) from [<8006f100>] (__handle_domain_irq+0x74/0xe8) [ 8864.949688] r4:809c0d7c r3:00000125 [ 8864.953342] [<8006f08c>] (__handle_domain_irq) from [<800087a0>] (gic_handle_irq+0x28/0x68) [ 8864.961707] r9:00000000 r8:00000002 r7:c080e100 r6:809c6c6c r5:be07bde8 r4:c080e10c [ 8864.969597] [<80008778>] (gic_handle_irq) from [<800133a4>] (__irq_svc+0x44/0x5c) [ 8864.977097] Exception stack(0xbe07bde8 to 0xbe07be30) [ 8864.982173] bde0: 00000001 00000001 00000000 00000000 be7c1680 00000000 [ 8864.990377] be00: be0cfe00 bd93eb40 00000002 00000000 00000000 be07be44 be07be00 be07be30 [ 8864.998573] be20: 8006278c 806f2934 20000013 ffffffff [ 8865.003638] r7:be07be1c r6:ffffffff r5:20000013 r4:806f2934 [ 8865.009447] [<806f2904>] (_raw_spin_unlock_irq) from [<80049cfc>] (finish_task_switch+0xa0/0x160) [ 8865.018334] r4:be7c1680 r3:be069380 [ 8865.021993] [<80049c5c>] (finish_task_switch) from [<806ed614>] (__schedule+0x1e0/0x5dc) [ 8865.030098] r8:be0cfe00 r7:bd23b600 r6:be069380 r5:be3289c0 r4:be7c1680 [ 8865.036942] [<806ed434>] (__schedule) from [<806edd0c>] (preempt_schedule_common+0x28/0x44) [ 8865.045307] r9:00000000 r8:809d4598 r7:00000001 r6:00000000 r5:be035740 r4:be07a000 [ 8865.053197] [<806edce4>] (preempt_schedule_common) from [<806edd68>] (_cond_resched+0x40/0x48) [ 8865.061822] r4:be07a000 r3:be07a000 [ 8865.065472] [<806edd28>] (_cond_resched) from [<8002e860>] (run_ksoftirqd+0x44/0x64) [ 8865.073252] [<8002e81c>] (run_ksoftirqd) from [<800494b0>] (smpboot_thread_fn+0x124/0x190) [ 8865.081550] [<8004938c>] (smpboot_thread_fn) from [<80045928>] (kthread+0xdc/0xf8) [ 8865.089133] r10:00000000 r9:00000000 r8:00000000 r7:8004938c r6:be035740 r5:be035780 [ 8865.097097] r4:00000000 r3:be069380 [ 8865.100752] [<8004584c>] (kthread) from [<8000ed10>] (ret_from_fork+0x14/0x24) [ 8865.107990] r7:00000000 r6:00000000 r5:8004584c r4:be035780 [ 8865.113767] Code: e320f000 e4913004 e4914004 e4915004 (e4916004) [ 8865.120006] ---[ end trace b0a4c6bd499288ca ]--- [ 8865.124697] Kernel panic - not syncing: Fatal exception in interrupt [ 8865.131084] ---[ end Kernel panic - not syncing: Fatal exception in interrupt Cc: [v3.19+] [email protected] Tested-by: Peter Chen <[email protected]> Signed-off-by: Peter Chen <[email protected]> Signed-off-by: Fugang Duan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
The a80 optimus has 8 CPUs. I propose we increase the maximum number of CPUs to 8 to avoid the following warning identified during automated boot testing [1]. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at ../arch/arm/kernel/devtree.c:144 arm_dt_init_cpu_maps+0x110/0x1e0() DT /cpu 5 nodes greater than max cores 4, capping them CPU: 0 PID: 0 Comm: swapper Not tainted 3.19.0-00528-gbdccc4edeb03 parallella#1 Hardware name: Allwinner sun9i Family [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (dump_stack+0x74/0x90) [] (dump_stack) from [] (warn_slowpath_common+0x70/0xac) [] (warn_slowpath_common) from [] (warn_slowpath_fmt+0x30/0x40) [] (warn_slowpath_fmt) from [] (arm_dt_init_cpu_maps+0x110/0x1e0) [] (arm_dt_init_cpu_maps) from [] (setup_arch+0x634/0x8d4) [] (setup_arch) from [] (start_kernel+0x88/0x3ac) [] (start_kernel) from [<20008074>] (0x20008074) ---[ end trace cb88537fdc8fa200 ]--- [1] http://storage.kernelci.org/mainline/v3.19-528-gbdccc4edeb03/arm-sunxi_defconfig/lab-tbaker/boot-sun9i-a80-optimus.html Cc: Maxime Ripard <[email protected]> Cc: Olof Johansson <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Tyler Baker <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
…s to 16 The HiSilicon HiP04 has 16 CPUs. I propose we increase the maximum number of CPUs to 16 to avoid the following warning identified during automated boot testing [1]. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at ../arch/arm/kernel/devtree.c:144 arm_dt_init_cpu_maps+0x118/0x1e8() DT /cpu 9 nodes greater than max cores 8, capping them Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 3.19.0-00528-gbdccc4edeb03 parallella#1 Hardware name: Hisilicon HiP04 (Flattened Device Tree) [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (dump_stack+0x78/0x94) [] (dump_stack) from [] (warn_slowpath_common+0x74/0xb0) [] (warn_slowpath_common) from [] (warn_slowpath_fmt+0x30/0x40) [] (warn_slowpath_fmt) from [] (arm_dt_init_cpu_maps+0x118/0x1e8) [] (arm_dt_init_cpu_maps) from [] (setup_arch+0x638/0x9a0) [] (setup_arch) from [] (start_kernel+0x8c/0x3b4) [] (start_kernel) from [<10208074>] (0x10208074) ---[ end trace cb88537fdc8fa200 ]--- [1] http://storage.kernelci.org/mainline/v3.19-528-gbdccc4edeb03/arm-multi_v7_defconfig/lab-tbaker/boot-hip04-d01.html Cc: Olof Johansson <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Tyler Baker <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
The commit below introduced an unsafe dereference of mvmvif->phy_ctxt. It can be NULL even if we hold the mutex. We can be handling a BT Coex notification while the vif has already been unassigned. This can happen since the BT Coex notification is hanled asynchronuously: we can have started to handle the BT Coex notification trying to acquire the mutex while the unassign flow already got it. The BT Coex notification handling will wait for the mutext. I'll get it later, but then mvmvif->phy_ctxt will be NULL. Panic log: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<f985180d>] iwl_mvm_bt_notif_iterator+0x9d/0x340 [iwlmvm] *pdpt = 0000000000000000 *pde = f000eef300000007 Oops: 0000 [parallella#1] SMP Workqueue: events iwl_mvm_async_handlers_wk [iwlmvm] task: ed719b20 ti: ec03e000 task.ti: ec03e000 EIP: 0060:[<f985180d>] EFLAGS: 00010202 CPU: 2 EIP is at iwl_mvm_bt_notif_iterator+0x9d/0x340 [iwlmvm] EAX: 00000000 EBX: f6d3cb70 ECX: f6d3cb70 EDX: 00000000 ESI: ec03fe40 EDI: efeb8810 EBP: ec03fdf0 ESP: ec03fdac DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 80050033 CR2: 00000000 CR3: 01a1a000 CR4: 001407f0 Stack: f743ca80 f744a404 ec03fdcc c10e3952 00003aba f743ca80 00000246 f743ca80 00000246 00000000 00000001 00000000 ebd45ff6 ebd458a4 f6d3c500 ebd45578 ebd44b01 ec03fe18 f99e1bc2 00000002 ebd44bc0 f9851770 00000000 f6d3c500 Call Trace: [<c10e3952>] ? ring_buffer_unlock_commit+0xa2/0xd0 [<f99e1bc2>] __iterate_interfaces+0x82/0x110 [mac80211] [<f9851770>] ? iwl_mvm_bt_coex_reduced_txp+0x140/0x140 [iwlmvm] [<f99e1c6a>] ieee80211_iterate_active_interfaces_atomic+0x1a/0x20 [mac80211] [<f9851427>] iwl_mvm_bt_coex_notif_handle+0x77/0x280 [iwlmvm] [<f9852161>] iwl_mvm_rx_bt_coex_notif_old+0x211/0x220 [iwlmvm] [<f9850b8b>] iwl_mvm_rx_bt_coex_notif+0x19b/0x1b0 [iwlmvm] [<f983944f>] iwl_mvm_async_handlers_wk+0x7f/0xe0 [iwlmvm] CC: <[email protected]> [3.19+] Fixes: 123f515 ("iwlwifi: mvm: BT Coex - add support for TTC / RRC") Signed-off-by: Emmanuel Grumbach <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
Commit de7b5b3 ("net: eth: xgene: change APM X-Gene SoC platform ethernet to support ACPI") breaks booting with devicetree with UEFI firmware. In that case, I get: Unhandled fault: synchronous external abort (0x96000010) at 0xfffffc0000620010 Internal error: : 96000010 [parallella#1] SMP Modules linked in: vfat fat xfs libcrc32c ahci_xgene libahci_platform libahci CPU: 7 PID: 634 Comm: NetworkManager Not tainted 4.0.0-rc1+ parallella#4 Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0-rh-0.14 Mar 1 2015 task: fffffe03d4c7e100 ti: fffffe03d4e24000 task.ti: fffffe03d4e24000 PC is at xgene_enet_rd_mcx_mac.isra.11+0x58/0xd4 LR is at xgene_gmac_tx_enable+0x2c/0x50 pc : [<fffffe000069d6fc>] lr : [<fffffe000069dcc4>] pstate: 80000145 sp : fffffe03d4e27590 x29: fffffe03d4e27590 x28: 0000000000000000 x27: fffffe03d4e277c0 x26: fffffe03da8fda10 x25: fffffe03d4e2760c x24: fffffe03d49e28c0 x23: fffffc0000620004 x22: 0000000000000000 x21: fffffc0000620000 x20: fffffc0000620010 x19: 000000000000000b x18: 000003ffd4a96020 x17: 000003ff7fc1f7a0 x16: fffffe000079b9cc x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: fffffe03d4e24000 x11: fffffe03d4e27da0 x10: 0000000000000001 x9 : 0000000000000000 x8 : fffffe03d4e27a20 x7 : 0000000000000000 x6 : 00000000ffffffef x5 : fffffe000105f7d0 x4 : fffffe00007ca8c8 x3 : fffffe03d4e2760c x2 : 0000000000000000 x1 : fffffc0000620000 x0 : 0000000040000000 Process NetworkManager (pid: 634, stack limit = 0xfffffe03d4e24028) Stack: (0xfffffe03d4e27590 to 0xfffffe03d4e28000) ... Call trace: [<fffffe000069d6fc>] xgene_enet_rd_mcx_mac.isra.11+0x58/0xd4 [<fffffe000069dcc0>] xgene_gmac_tx_enable+0x28/0x50 [<fffffe00006a112c>] xgene_enet_open+0x2c/0x130 [<fffffe00007b9254>] __dev_open+0xc8/0x148 [<fffffe00007b956c>] __dev_change_flags+0x90/0x158 [<fffffe00007b9664>] dev_change_flags+0x30/0x70 [<fffffe00007c8ab8>] do_setlink+0x278/0x870 [<fffffe00007c95bc>] rtnl_newlink+0x404/0x6a8 [<fffffe00007c8040>] rtnetlink_rcv_msg+0x98/0x218 [<fffffe00007e78e4>] netlink_rcv_skb+0xe0/0xf8 [<fffffe00007c7f94>] rtnetlink_rcv+0x30/0x44 [<fffffe00007e6f2c>] netlink_unicast+0xfc/0x210 [<fffffe00007e75b8>] netlink_sendmsg+0x498/0x5ac [<fffffe00007990b8>] do_sock_sendmsg+0xa4/0xcc [<fffffe000079a958>] ___sys_sendmsg+0x1fc/0x208 [<fffffe000079b984>] __sys_sendmsg+0x4c/0x94 [<fffffe000079b9f8>] SyS_sendmsg+0x2c/0x3c The problem here is that the enet hw clocks are not getting initialized because of a test to avoid the initialization if UEFI is used to boot. This is an incorrect test. When booting with UEFI and devicetree, the kernel must still initialize the enet hw clocks. If booting with ACPI, the clock hw is not exposed to the kernel and it is that case where we want to avoid initializing clocks. Signed-off-by: Mark Salter <[email protected]> Acked-by: Feng Kan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
This crash was reported: [ 366.947370] sd 3:0:1:0: [sdb] Spinning up disk.... [ 368.804046] BUG: unable to handle kernel NULL pointer dereference at (null) [ 368.804072] IP: [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b [ 368.804098] PGD 0 [ 368.804114] Oops: 0002 [parallella#1] SMP [ 368.804143] CPU 1 [ 368.804151] Modules linked in: sg netconsole s3g(PO) uinput joydev hid_multitouch usbhid hid snd_hda_codec_via cpufreq_userspace cpufreq_powersave cpufreq_stats uhci_hcd cpufreq_conservative snd_hda_intel snd_hda_codec snd_hwdep snd_pcm sdhci_pci snd_page_alloc sdhci snd_timer snd psmouse evdev serio_raw pcspkr soundcore xhci_hcd shpchp s3g_drm(O) mvsas mmc_core ahci libahci drm i2c_core acpi_cpufreq mperf video processor button thermal_sys dm_dmirror exfat_fs exfat_core dm_zcache dm_mod padlock_aes aes_generic padlock_sha iscsi_target_mod target_core_mod configfs sswipe libsas libata scsi_transport_sas picdev via_cputemp hwmon_vid fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 sd_mod crc_t10dif usb_storage scsi_mod ehci_hcd usbcore usb_common [ 368.804749] [ 368.804764] Pid: 392, comm: kworker/u:3 Tainted: P W O 3.4.87-logicube-ng.22 parallella#1 To be filled by O.E.M. To be filled by O.E.M./EPIA-M920 [ 368.804802] RIP: 0010:[<ffffffff81358457>] [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b [ 368.804827] RSP: 0018:ffff880117001cc0 EFLAGS: 00010246 [ 368.804842] RAX: 0000000000000000 RBX: ffff8801185030d0 RCX: ffff88008edcb420 [ 368.804857] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff8801185030d4 [ 368.804873] RBP: ffff8801181531c0 R08: 0000000000000020 R09: 00000000fffffffe [ 368.804885] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801185030d4 [ 368.804899] R13: 0000000000000002 R14: ffff880117001fd8 R15: ffff8801185030d8 [ 368.804916] FS: 0000000000000000(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000 [ 368.804931] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 368.804946] CR2: 0000000000000000 CR3: 000000000160b000 CR4: 00000000000006e0 [ 368.804962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 368.804978] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 368.804995] Process kworker/u:3 (pid: 392, threadinfo ffff880117000000, task ffff8801181531c0) [ 368.805009] Stack: [ 368.805017] ffff8801185030d8 0000000000000000 ffffffff8161ddf0 ffffffff81056f7c [ 368.805062] 000000000000b503 ffff8801185030d0 ffff880118503000 0000000000000000 [ 368.805100] ffff8801185030d0 ffff8801188b8000 ffff88008edcb420 ffffffff813583ac [ 368.805135] Call Trace: [ 368.805153] [<ffffffff81056f7c>] ? up+0xb/0x33 [ 368.805168] [<ffffffff813583ac>] ? mutex_lock+0x16/0x25 [ 368.805194] [<ffffffffa018c414>] ? smp_execute_task+0x4e/0x222 [libsas] [ 368.805217] [<ffffffffa018ce1c>] ? sas_find_bcast_dev+0x3c/0x15d [libsas] [ 368.805240] [<ffffffffa018ce4f>] ? sas_find_bcast_dev+0x6f/0x15d [libsas] [ 368.805264] [<ffffffffa018e989>] ? sas_ex_revalidate_domain+0x37/0x2ec [libsas] [ 368.805280] [<ffffffff81355a2a>] ? printk+0x43/0x48 [ 368.805296] [<ffffffff81359a65>] ? _raw_spin_unlock_irqrestore+0xc/0xd [ 368.805318] [<ffffffffa018b767>] ? sas_revalidate_domain+0x85/0xb6 [libsas] [ 368.805336] [<ffffffff8104e5d9>] ? process_one_work+0x151/0x27c [ 368.805351] [<ffffffff8104f6cd>] ? worker_thread+0xbb/0x152 [ 368.805366] [<ffffffff8104f612>] ? manage_workers.isra.29+0x163/0x163 [ 368.805382] [<ffffffff81052c4e>] ? kthread+0x79/0x81 [ 368.805399] [<ffffffff8135fea4>] ? kernel_thread_helper+0x4/0x10 [ 368.805416] [<ffffffff81052bd5>] ? kthread_flush_work_fn+0x9/0x9 [ 368.805431] [<ffffffff8135fea0>] ? gs_change+0x13/0x13 [ 368.805442] Code: 83 7d 30 63 7e 04 f3 90 eb ab 4c 8d 63 04 4c 8d 7b 08 4c 89 e7 e8 fa 15 00 00 48 8b 43 10 4c 89 3c 24 48 89 63 10 48 89 44 24 08 <48> 89 20 83 c8 ff 48 89 6c 24 10 87 03 ff c8 74 35 4d 89 ee 41 [ 368.805851] RIP [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b [ 368.805877] RSP <ffff880117001cc0> [ 368.805886] CR2: 0000000000000000 [ 368.805899] ---[ end trace b720682065d8f4cc ]--- It's directly caused by 89d3cf6 [SCSI] libsas: add mutex for SMP task execution, but shows a deeper cause: expander functions expect to be able to cast to and treat domain devices as expanders. The correct fix is to only do expander discover when we know we've got an expander device to avoid wrongly casting a non-expander device. Reported-by: Praveen Murali <[email protected]> Tested-by: Praveen Murali <[email protected]> Cc: [email protected] Signed-off-by: James Bottomley <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
We skip initialisation of ITS in case the device-tree has no corresponding description, but we are still accessing to ITS bits while setting CPU interface what leads to the kernel panic: ITS: No ITS available, not enabling LPIs CPU0: found redistributor 0 region 0:0x000000002f100000 Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = ffffffc0007fb000 [00000000] *pgd=00000000fc407003, *pud=00000000fc407003, *pmd=00000000fc408003, *pte=006000002f000707 Internal error: Oops: 96000005 [parallella#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.19.0-rc2+ analogdevicesinc#318 Hardware name: FVP Base (DT) task: ffffffc00077edb0 ti: ffffffc00076c000 task.ti: ffffffc00076c000 PC is at its_cpu_init+0x2c/0x320 LR is at gic_cpu_init+0x168/0x1bc It happens in gic_rdists_supports_plpis() because gic_rdists is NULL. The gic_rdists is set to non-NULL only when ITS node is presented in the device-tree. Fix this by moving the call to gic_rdists_supports_plpis() inside the !list_empty(&its_nodes) block, because it is that list that guards the validity of the rest of the information in this driver. Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Vladimir Murzin <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Cooper <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be set to incorrect values. Given that 'struct sk_buff' allocates from rcvbuf, incorrectly set buffer length could result to memory allocation failures. For example, set them as follows: # sysctl net.core.rmem_default=64 net.core.wmem_default = 64 # sysctl net.core.wmem_default=64 net.core.wmem_default = 64 # ping localhost -s 1024 -i 0 > /dev/null This could result to the following failure: skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32 head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL> kernel BUG at net/core/skbuff.c:102! invalid opcode: 0000 [parallella#1] SMP ... task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000 RIP: 0010:[<ffffffff8155fbd1>] [<ffffffff8155fbd1>] skb_put+0xa1/0xb0 RSP: 0018:ffff88003ae8bc68 EFLAGS: 00010296 RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000 RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8 RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000 R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900 FS: 00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0 Stack: ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550 ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00 Call Trace: [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470 [<ffffffff81556f56>] sock_write_iter+0x146/0x160 [<ffffffff811d9612>] new_sync_write+0x92/0xd0 [<ffffffff811d9cd6>] vfs_write+0xd6/0x180 [<ffffffff811da499>] SyS_write+0x59/0xd0 [<ffffffff81651532>] system_call_fastpath+0x12/0x17 Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 RIP [<ffffffff8155fbd1>] skb_put+0xa1/0xb0 RSP <ffff88003ae8bc68> Kernel panic - not syncing: Fatal exception Moreover, the possible minimum is 1, so we can get another kernel panic: ... BUG: unable to handle kernel paging request at ffff88013caee5c0 IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0 ... Signed-off-by: Alexey Kodanev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
We don't delete napi from hash list during module exit. This will cause the following panic when doing module load and unload: BUG: unable to handle kernel paging request at 0000004e00000075 IP: [<ffffffff816bd01b>] napi_hash_add+0x6b/0xf0 PGD 3c5d5067 PUD 0 Oops: 0000 [parallella#1] SMP ... Call Trace: [<ffffffffa0a5bfb7>] init_vqs+0x107/0x490 [virtio_net] [<ffffffffa0a5c9f2>] virtnet_probe+0x562/0x791815639d880be [virtio_net] [<ffffffff8139e667>] virtio_dev_probe+0x137/0x200 [<ffffffff814c7f2a>] driver_probe_device+0x7a/0x250 [<ffffffff814c81d3>] __driver_attach+0x93/0xa0 [<ffffffff814c8140>] ? __device_attach+0x40/0x40 [<ffffffff814c6053>] bus_for_each_dev+0x63/0xa0 [<ffffffff814c7a79>] driver_attach+0x19/0x20 [<ffffffff814c76f0>] bus_add_driver+0x170/0x220 [<ffffffffa0a60000>] ? 0xffffffffa0a60000 [<ffffffff814c894f>] driver_register+0x5f/0xf0 [<ffffffff8139e41b>] register_virtio_driver+0x1b/0x30 [<ffffffffa0a60010>] virtio_net_driver_init+0x10/0x12 [virtio_net] This patch fixes this by doing this in virtnet_free_queues(). And also don't delete napi in virtnet_freeze() since it will call virtnet_free_queues() which has already did this. Fixes 9181563 ("virtio-net: rx busy polling support") Cc: Rusty Russell <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Signed-off-by: Jason Wang <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Mar 23, 2015
A number of tx queue wake-up events went missing due to the outlined scenario below. Start state is a pool of 16 tx URBs, active tx_urbs count = 15, with the netdev tx queue open. CPU parallella#1 [softirq] CPU parallella#2 [softirq] start_xmit() tx_acknowledge() ................ ................ atomic_inc(&tx_urbs); if (atomic_read(&tx_urbs) >= 16) { --> atomic_dec(&tx_urbs); netif_wake_queue(); return; <-- netif_stop_queue(); } At the end, the correct state expected is a 15 tx_urbs count value with the tx queue state _open_. Due to the race, we get the same tx_urbs value but with the tx queue state _stopped_. The wake-up event is completely lost. Thus avoid hand-rolled concurrency mechanisms and use a proper lock for contexts and tx queue protection. Signed-off-by: Ahmed S. Darwish <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Apr 10, 2015
Functions rtsx_usb_ep0_read_register() and rtsx_usb_get_card_status() both use arbitrary buffer addresses from arguments directly for DMA and the buffers could be located in stack. This was caught by DMA-API debug check. Fixes this by using double-buffers via kzalloc in both functions to guarantee the validity of DMA buffer. WARNING: CPU: 1 PID: 25 at lib/dma-debug.c:1166 check_for_stack+0x96/0xe0() ehci-pci 0000:00:1a.0: DMA-API: device driver maps memory from stack [addr=ffff8801199e3cef] Modules linked in: rtsx_usb_ms arc4 memstick intel_rapl iosf_mbi rtl8192ce snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel rtl_pci rtl8192c_common snd_hda_controller x86_pkg_temp_thermal snd_hda_codec rtlwifi mac80211 coretemp kvm_intel kvm iTCO_wdt snd_hwdep snd_seq snd_seq_device crct10dif_pclmul iTCO_vendor_support sparse_keymap cfg80211 crc32_pclmul snd_pcm crc32c_intel ghash_clmulni_intel rfkill i2c_i801 snd_timer shpchp snd serio_raw mei_me lpc_ich soundcore mei tpm_tis tpm wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915 rtsx_usb_sdmmc mmc_core 8021q uas garp stp i2c_algo_bit llc mrp drm_kms_helper usb_storage drm rtsx_usb mfd_core r8169 mii video CPU: 1 PID: 25 Comm: kworker/1:2 Not tainted 3.20.0-0.rc0.git7.3.fc22.x86_64 parallella#1 Hardware name: WB WB-B06211/WB-B0621, BIOS EB062IWB V1.0 12/12/2013 Workqueue: events rtsx_usb_ms_handle_req [rtsx_usb_ms] 0000000000000000 000000003d188e66 ffff8801199e3808 ffffffff8187642b 0000000000000000 ffff8801199e3860 ffff8801199e3848 ffffffff810ab39a ffff8801199e3864 ffff8801199e3cef ffff880119b57098 ffff880119b37320 Call Trace: [<ffffffff8187642b>] dump_stack+0x4c/0x65 [<ffffffff810ab39a>] warn_slowpath_common+0x8a/0xc0 [<ffffffff810ab425>] warn_slowpath_fmt+0x55/0x70 [<ffffffff8187efe6>] ? _raw_spin_unlock_irqrestore+0x36/0x70 [<ffffffff81453156>] check_for_stack+0x96/0xe0 [<ffffffff81453934>] debug_dma_map_page+0x104/0x150 [<ffffffff81613b86>] usb_hcd_map_urb_for_dma+0x646/0x790 [<ffffffff81614165>] usb_hcd_submit_urb+0x1d5/0xa90 [<ffffffff81106f8f>] ? mark_held_locks+0x7f/0xc0 [<ffffffff81106f8f>] ? mark_held_locks+0x7f/0xc0 [<ffffffff81103a15>] ? lockdep_init_map+0x65/0x5d0 [<ffffffff81615d7e>] usb_submit_urb+0x42e/0x5f0 [<ffffffff81616787>] usb_start_wait_urb+0x77/0x190 [<ffffffff8124f035>] ? __kmalloc+0x205/0x2d0 [<ffffffff8161697c>] usb_control_msg+0xdc/0x130 [<ffffffffa0031669>] rtsx_usb_ep0_read_register+0x59/0x70 [rtsx_usb] [<ffffffffa00310c1>] ? rtsx_usb_get_rsp+0x41/0x50 [rtsx_usb] [<ffffffffa071da4e>] rtsx_usb_ms_handle_req+0x7ce/0x9c5 [rtsx_usb_ms] Reported-by: Josh Boyer <[email protected]> Signed-off-by: Roger Tseng <[email protected]> Signed-off-by: Lee Jones <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Apr 10, 2015
…lpcr() Currently, kvmppc_set_lpcr() has a spinlock around the whole function, and inside that does mutex_lock(&kvm->lock). It is not permitted to take a mutex while holding a spinlock, because the mutex_lock might call schedule(). In addition, this causes lockdep to warn about a lock ordering issue: ====================================================== [ INFO: possible circular locking dependency detected ] 3.18.0-kvm-04645-gdfea862-dirty analogdevicesinc#131 Not tainted ------------------------------------------------------- qemu-system-ppc/8179 is trying to acquire lock: (&kvm->lock){+.+.+.}, at: [<d00000000ecc1f54>] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv] but task is already holding lock: (&(&vcore->lock)->rlock){+.+...}, at: [<d00000000ecc1ea0>] .kvmppc_set_lpcr+0x40/0x1c0 [kvm_hv] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> parallella#1 (&(&vcore->lock)->rlock){+.+...}: [<c000000000b3c120>] .mutex_lock_nested+0x80/0x570 [<d00000000ecc7a14>] .kvmppc_vcpu_run_hv+0xc4/0xe40 [kvm_hv] [<d00000000eb9f5cc>] .kvmppc_vcpu_run+0x2c/0x40 [kvm] [<d00000000eb9cb24>] .kvm_arch_vcpu_ioctl_run+0x54/0x160 [kvm] [<d00000000eb94478>] .kvm_vcpu_ioctl+0x4a8/0x7b0 [kvm] [<c00000000026cbb4>] .do_vfs_ioctl+0x444/0x770 [<c00000000026cfa4>] .SyS_ioctl+0xc4/0xe0 [<c000000000009264>] syscall_exit+0x0/0x98 -> #0 (&kvm->lock){+.+.+.}: [<c0000000000ff28c>] .lock_acquire+0xcc/0x1a0 [<c000000000b3c120>] .mutex_lock_nested+0x80/0x570 [<d00000000ecc1f54>] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv] [<d00000000ecc510c>] .kvmppc_set_one_reg_hv+0x4dc/0x990 [kvm_hv] [<d00000000eb9f234>] .kvmppc_set_one_reg+0x44/0x330 [kvm] [<d00000000eb9c9dc>] .kvm_vcpu_ioctl_set_one_reg+0x5c/0x150 [kvm] [<d00000000eb9ced4>] .kvm_arch_vcpu_ioctl+0x214/0x2c0 [kvm] [<d00000000eb940b0>] .kvm_vcpu_ioctl+0xe0/0x7b0 [kvm] [<c00000000026cbb4>] .do_vfs_ioctl+0x444/0x770 [<c00000000026cfa4>] .SyS_ioctl+0xc4/0xe0 [<c000000000009264>] syscall_exit+0x0/0x98 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&vcore->lock)->rlock); lock(&kvm->lock); lock(&(&vcore->lock)->rlock); lock(&kvm->lock); *** DEADLOCK *** 2 locks held by qemu-system-ppc/8179: #0: (&vcpu->mutex){+.+.+.}, at: [<d00000000eb93f18>] .vcpu_load+0x28/0x90 [kvm] parallella#1: (&(&vcore->lock)->rlock){+.+...}, at: [<d00000000ecc1ea0>] .kvmppc_set_lpcr+0x40/0x1c0 [kvm_hv] stack backtrace: CPU: 4 PID: 8179 Comm: qemu-system-ppc Not tainted 3.18.0-kvm-04645-gdfea862-dirty analogdevicesinc#131 Call Trace: [c000001a66c0f310] [c000000000b486ac] .dump_stack+0x88/0xb4 (unreliable) [c000001a66c0f390] [c0000000000f8bec] .print_circular_bug+0x27c/0x3d0 [c000001a66c0f440] [c0000000000fe9e8] .__lock_acquire+0x2028/0x2190 [c000001a66c0f5d0] [c0000000000ff28c] .lock_acquire+0xcc/0x1a0 [c000001a66c0f6a0] [c000000000b3c120] .mutex_lock_nested+0x80/0x570 [c000001a66c0f7c0] [d00000000ecc1f54] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv] [c000001a66c0f860] [d00000000ecc510c] .kvmppc_set_one_reg_hv+0x4dc/0x990 [kvm_hv] [c000001a66c0f8d0] [d00000000eb9f234] .kvmppc_set_one_reg+0x44/0x330 [kvm] [c000001a66c0f960] [d00000000eb9c9dc] .kvm_vcpu_ioctl_set_one_reg+0x5c/0x150 [kvm] [c000001a66c0f9f0] [d00000000eb9ced4] .kvm_arch_vcpu_ioctl+0x214/0x2c0 [kvm] [c000001a66c0faf0] [d00000000eb940b0] .kvm_vcpu_ioctl+0xe0/0x7b0 [kvm] [c000001a66c0fcb0] [c00000000026cbb4] .do_vfs_ioctl+0x444/0x770 [c000001a66c0fd90] [c00000000026cfa4] .SyS_ioctl+0xc4/0xe0 [c000001a66c0fe30] [c000000000009264] syscall_exit+0x0/0x98 This fixes it by moving the mutex_lock()/mutex_unlock() pair outside the spin-locked region. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Apr 10, 2015
If register_shrinker() failed, nfsd will cause a NULL pointer access as, [ 9250.875465] nfsd: last server has exited, flushing export cache [ 9251.427270] BUG: unable to handle kernel NULL pointer dereference at (null) [ 9251.427393] IP: [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0 [ 9251.427579] PGD 13e4d067 PUD 13e4c067 PMD 0 [ 9251.427633] Oops: 0000 [parallella#1] SMP DEBUG_PAGEALLOC [ 9251.427706] Modules linked in: ip6t_rpfilter ip6t_REJECT bnep bluetooth xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw btrfs xfs microcode ppdev serio_raw pcspkr xor libcrc32c raid6_pq e1000 parport_pc parport i2c_piix4 i2c_core nfsd(OE-) auth_rpcgss nfs_acl lockd sunrpc(E) ata_generic pata_acpi [ 9251.428240] CPU: 0 PID: 1557 Comm: rmmod Tainted: G OE 3.16.0-rc2+ analogdevicesinc#22 [ 9251.428366] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 9251.428496] task: ffff880000849540 ti: ffff8800136f4000 task.ti: ffff8800136f4000 [ 9251.428593] RIP: 0010:[<ffffffff8136fc29>] [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0 [ 9251.428696] RSP: 0018:ffff8800136f7ea0 EFLAGS: 00010207 [ 9251.428751] RAX: 0000000000000000 RBX: ffffffffa0116d48 RCX: dead000000200200 [ 9251.428814] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa0116d48 [ 9251.428876] RBP: ffff8800136f7ea0 R08: ffff8800136f4000 R09: 0000000000000001 [ 9251.428939] R10: 8080808080808080 R11: 0000000000000000 R12: ffffffffa011a5a0 [ 9251.429002] R13: 0000000000000800 R14: 0000000000000000 R15: 00000000018ac090 [ 9251.429064] FS: 00007fb9acef0740(0000) GS:ffff88003fa00000(0000) knlGS:0000000000000000 [ 9251.429164] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9251.429221] CR2: 0000000000000000 CR3: 0000000031a17000 CR4: 00000000001407f0 [ 9251.429306] Stack: [ 9251.429410] ffff8800136f7eb8 ffffffff8136fcdd ffffffffa0116d20 ffff8800136f7ed0 [ 9251.429511] ffffffff8118a0f2 0000000000000000 ffff8800136f7ee0 ffffffffa00eb765 [ 9251.429610] ffff8800136f7ef0 ffffffffa010e93c ffff8800136f7f78 ffffffff81104ac2 [ 9251.429709] Call Trace: [ 9251.429755] [<ffffffff8136fcdd>] list_del+0xd/0x30 [ 9251.429896] [<ffffffff8118a0f2>] unregister_shrinker+0x22/0x40 [ 9251.430037] [<ffffffffa00eb765>] nfsd_reply_cache_shutdown+0x15/0x90 [nfsd] [ 9251.430106] [<ffffffffa010e93c>] exit_nfsd+0x9/0x6cd [nfsd] [ 9251.430192] [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200 [ 9251.430280] [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90 [ 9251.430395] [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b [ 9251.430457] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08 [ 9251.430691] RIP [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0 [ 9251.430755] RSP <ffff8800136f7ea0> [ 9251.430805] CR2: 0000000000000000 [ 9251.431033] ---[ end trace 080f3050d082b4ea ]--- Signed-off-by: Kinglong Mee <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Apr 10, 2015
We occasionally see in procedure mlx4_GEN_EQE that the driver tries to grab an uninitialized mutex. This can occur in only one of two ways: 1. We are trying to generate an async event on an uninitialized slave. 2. We are trying to generate an async event on an illegal slave number ( < 0 or > persist->num_vfs) or an inactive slave. To deal with parallella#1: move the mutex initialization from specific slave init sequence in procedure mlx_master_do_cmd to mlx4_multi_func_init() (so that the mutex is always initialized for all slaves). To deal with parallella#2: check in procedure mlx4_GEN_EQE that the slave number provided is in the proper range and that the slave is active. Signed-off-by: Jack Morgenstein <[email protected]> Signed-off-by: Or Gerlitz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Apr 10, 2015
Qiu Xishi reported the following BUG when testing hot-add/hot-remove node under stress condition: BUG: unable to handle kernel paging request at 0000000000025f60 IP: next_online_pgdat+0x1/0x50 PGD 0 Oops: 0000 [parallella#1] SMP ACPI: Device does not support D3cold Modules linked in: fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod coretemp mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 pcspkr microcode igb dca i2c_algo_bit ipv6 megaraid_sas iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support tg3 sg hwmon ptp lpc_ich pps_core mfd_core acpi_pad rtc_cmos button ext3 jbd mbcache sd_mod crc_t10dif scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh ahci libahci libata scsi_mod [last unloaded: rasf] CPU: 23 PID: 238 Comm: kworker/23:1 Tainted: G O 3.10.15-5885-euler0302 parallella#1 Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei N1, BIOS V100R001 03/02/2015 Workqueue: events vmstat_update task: ffffa800d32c0000 ti: ffffa800d32ae000 task.ti: ffffa800d32ae000 RIP: 0010: next_online_pgdat+0x1/0x50 RSP: 0018:ffffa800d32afce8 EFLAGS: 00010286 RAX: 0000000000001440 RBX: ffffffff81da53b8 RCX: 0000000000000082 RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000000 RBP: ffffa800d32afd28 R08: ffffffff81c93bfc R09: ffffffff81cbdc96 R10: 00000000000040ec R11: 00000000000000a0 R12: ffffa800fffb3440 R13: ffffa800d32afd38 R14: 0000000000000017 R15: ffffa800e6616800 FS: 0000000000000000(0000) GS:ffffa800e6600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000025f60 CR3: 0000000001a0b000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: refresh_cpu_vm_stats+0xd0/0x140 vmstat_update+0x11/0x50 process_one_work+0x194/0x3d0 worker_thread+0x12b/0x410 kthread+0xc6/0xd0 ret_from_fork+0x7c/0xb0 The cause is the "memset(pgdat, 0, sizeof(*pgdat))" at the end of try_offline_node, which will reset all the content of pgdat to 0, as the pgdat is accessed lock-free, so that the users still using the pgdat will panic, such as the vmstat_update routine. process A: offline node XX: vmstat_updat() refresh_cpu_vm_stats() for_each_populated_zone() find online node XX cond_resched() offline cpu and memory, then try_offline_node() node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat)) zone = next_zone(zone) pg_data_t *pgdat = zone->zone_pgdat; // here pgdat is NULL now next_online_pgdat(pgdat) next_online_node(pgdat->node_id); // NULL pointer access So the solution here is postponing the reset of obsolete pgdat from try_offline_node() to hotadd_new_pgdat(), and just resetting pgdat->nr_zones and pgdat->classzone_idx to be 0 rather than the memset 0 to avoid breaking pointer information in pgdat. Signed-off-by: Gu Zheng <[email protected]> Reported-by: Xishi Qiu <[email protected]> Suggested-by: KAMEZAWA Hiroyuki <[email protected]> Cc: David Rientjes <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Taku Izumi <[email protected]> Cc: Tang Chen <[email protected]> Cc: Xie XiuQi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Apr 10, 2015
Commit 3c60509 ("mm/page_alloc: restrict max order of merging on isolated pageblock") changed the logic of unset_migratetype_isolate to check the buddy allocator and explicitly call __free_pages to merge. The page that is being freed in this path never had prep_new_page called so set_page_refcounted is called explicitly but there is no call to kernel_map_pages. With the default kernel_map_pages this is mostly harmless but if kernel_map_pages does any manipulation of the page tables (unmapping or setting pages to read only) this may trigger a fault: alloc_contig_range test_pages_isolated(ceb00, ced00) failed Unable to handle kernel paging request at virtual address ffffffc0cec00000 pgd = ffffffc045fc4000 [ffffffc0cec00000] *pgd=0000000000000000 Internal error: Oops: 9600004f [parallella#1] PREEMPT SMP Modules linked in: exfatfs CPU: 1 PID: 23237 Comm: TimedEventQueue Not tainted 3.10.49-gc72ad36-dirty parallella#1 task: ffffffc03de52100 ti: ffffffc015388000 task.ti: ffffffc015388000 PC is at memset+0xc8/0x1c0 LR is at kernel_map_pages+0x1ec/0x244 Fix this by calling kernel_map_pages to ensure the page is set in the page table properly Fixes: 3c60509 ("mm/page_alloc: restrict max order of merging on isolated pageblock") Signed-off-by: Laura Abbott <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Mel Gorman <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Zhang Yanfei <[email protected]> Cc: Xishi Qiu <[email protected]> Cc: Vladimir Davydov <[email protected]> Acked-by: Joonsoo Kim <[email protected]> Cc: Gioh Kim <[email protected]> Cc: Michal Nazarewicz <[email protected]> Cc: Marek Szyprowski <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Jan 12, 2016
Jason reported an oops caused by SCTP on his ARM machine with SCTP authentication enabled: Internal error: Oops: 17 [parallella#1] ARM CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty parallella#1 task: c6eefa40 ti: c6f52000 task.ti: c6f52000 PC is at sctp_auth_calculate_hmac+0xc4/0x10c LR is at sg_init_table+0x20/0x38 pc : [<c024bb80>] lr : [<c00f32dc>] psr: 40000013 sp : c6f538e8 ip : 00000000 fp : c6f53924 r10: c6f50d80 r9 : 00000000 r8 : 00010000 r7 : 00000000 r6 : c7be4000 r5 : 00000000 r4 : c6f56254 r3 : c00c8170 r2 : 00000001 r1 : 00000008 r0 : c6f1e660 Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005397f Table: 06f28000 DAC: 00000015 Process sctp-test (pid: 104, stack limit = 0xc6f521c0) Stack: (0xc6f538e8 to 0xc6f54000) [...] Backtrace: [<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8) [<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844) [<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28) [<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220) [<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4) [<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160) [<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74) [<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888) While we already had various kind of bugs in that area ec0223e ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable") and b14878c ("net: sctp: cache auth_enable per endpoint"), this one is a bit of a different kind. Giving a bit more background on why SCTP authentication is needed can be found in RFC4895: SCTP uses 32-bit verification tags to protect itself against blind attackers. These values are not changed during the lifetime of an SCTP association. Looking at new SCTP extensions, there is the need to have a method of proving that an SCTP chunk(s) was really sent by the original peer that started the association and not by a malicious attacker. To cause this bug, we're triggering an INIT collision between peers; normal SCTP handshake where both sides intent to authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO parameters that are being negotiated among peers: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- RFC4895 says that each endpoint therefore knows its own random number and the peer's random number *after* the association has been established. The local and peer's random number along with the shared key are then part of the secret used for calculating the HMAC in the AUTH chunk. Now, in our scenario, we have 2 threads with 1 non-blocking SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling sctp_bindx(3), listen(2) and connect(2) against each other, thus the handshake looks similar to this, e.g.: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------- -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------> ... Since such collisions can also happen with verification tags, the RFC4895 for AUTH rather vaguely says under section 6.1: In case of INIT collision, the rules governing the handling of this Random Number follow the same pattern as those for the Verification Tag, as explained in Section 5.2.4 of RFC 2960 [5]. Therefore, each endpoint knows its own Random Number and the peer's Random Number after the association has been established. In RFC2960, section 5.2.4, we're eventually hitting Action B: B) In this case, both sides may be attempting to start an association at about the same time but the peer endpoint started its INIT after responding to the local endpoint's INIT. Thus it may have picked a new Verification Tag not being aware of the previous Tag it had sent this endpoint. The endpoint should stay in or enter the ESTABLISHED state but it MUST update its peer's Verification Tag from the State Cookie, stop any init or cookie timers that may running and send a COOKIE ACK. In other words, the handling of the Random parameter is the same as behavior for the Verification Tag as described in Action B of section 5.2.4. Looking at the code, we exactly hit the sctp_sf_do_dupcook_b() case which triggers an SCTP_CMD_UPDATE_ASSOC command to the side effect interpreter, and in fact it properly copies over peer_{random, hmacs, chunks} parameters from the newly created association to update the existing one. Also, the old asoc_shared_key is being released and based on the new params, sctp_auth_asoc_init_active_key() updated. However, the issue observed in this case is that the previous asoc->peer.auth_capable was 0, and has *not* been updated, so that instead of creating a new secret, we're doing an early return from the function sctp_auth_asoc_init_active_key() leaving asoc->asoc_shared_key as NULL. However, we now have to authenticate chunks from the updated chunk list (e.g. COOKIE-ACK). That in fact causes the server side when responding with ... <------------------ AUTH; COOKIE-ACK ----------------- ... to trigger a NULL pointer dereference, since in sctp_packet_transmit(), it discovers that an AUTH chunk is being queued for xmit, and thus it calls sctp_auth_calculate_hmac(). Since the asoc->active_key_id is still inherited from the endpoint, and the same as encoded into the chunk, it uses asoc->asoc_shared_key, which is still NULL, as an asoc_key and dereferences it in ... crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len) ... causing an oops. All this happens because sctp_make_cookie_ack() called with the *new* association has the peer.auth_capable=1 and therefore marks the chunk with auth=1 after checking sctp_auth_send_cid(), but it is *actually* sent later on over the then *updated* association's transport that didn't initialize its shared key due to peer.auth_capable=0. Since control chunks in that case are not sent by the temporary association which are scheduled for deletion, they are issued for xmit via SCTP_CMD_REPLY in the interpreter with the context of the *updated* association. peer.auth_capable was 0 in the updated association (which went from COOKIE_WAIT into ESTABLISHED state), since all previous processing that performed sctp_process_init() was being done on temporary associations, that we eventually throw away each time. The correct fix is to update to the new peer.auth_capable value as well in the collision case via sctp_assoc_update(), so that in case the collision migrated from 0 -> 1, sctp_auth_asoc_init_active_key() can properly recalculate the secret. This therefore fixes the observed server panic. Fixes: 730fc3d ("[SCTP]: Implete SCTP-AUTH parameter processing") Reported-by: Jason Gunthorpe <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Jason Gunthorpe <[email protected]> Cc: Vlad Yasevich <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Jan 12, 2016
In older versions of the IIO framework it was possible to pass a completely different set of channels to iio_buffer_register() as the one that is assigned to the IIO device. Commit 959d295 ("staging:iio: make iio_sw_buffer_preenable much more general.") introduced a restriction that requires that the set of channels that is passed to iio_buffer_register() is a subset of the channels assigned to the IIO device as the IIO core will use the list of channels that is assigned to the device to lookup a channel by scan index in iio_compute_scan_bytes(). If it can not find the channel the function will crash. This patch fixes the issue by making sure that the same set of channels is assigned to the IIO device and passed to iio_buffer_register(). Fixes the follow NULL pointer derefernce kernel crash: Unable to handle kernel NULL pointer dereference at virtual address 00000016 pgd = d53d0000 [00000016] *pgd=1534e831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [parallella#1] PREEMPT SMP ARM Modules linked in: CPU: 1 PID: 1626 Comm: bash Not tainted 3.15.0-19969-g2a180eb-dirty #9545 task: d6c124c0 ti: d539a000 task.ti: d539a000 PC is at iio_compute_scan_bytes+0x34/0xa8 LR is at iio_compute_scan_bytes+0x34/0xa8 pc : [<c03052e4>] lr : [<c03052e4>] psr: 60070013 sp : d539beb8 ip : 00000001 fp : 00000000 r10: 00000002 r9 : 00000000 r8 : 00000001 r7 : 00000000 r6 : d6dc8800 r5 : d7571000 r4 : 00000002 r3 : d7571000 r2 : 00000044 r1 : 00000001 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 18c5387d Table: 153d004a DAC: 00000015 Process bash (pid: 1626, stack limit = 0xd539a240) Stack: (0xd539beb8 to 0xd539c000) bea0: c02fc0e4 d7571000 bec0: d76c1640 d6dc8800 d757117c 00000000 d757112c c0305b04 d76c1690 d76c1640 bee0: d7571188 00000002 00000000 d7571000 d539a000 00000000 000dd1c8 c0305d54 bf00: d7571010 0160b868 00000002 c69d3900 d7573278 d757330 c69d3900 c01ece90 bf20: 00000002 c0103fac c0103f6c d539bf88 00000002 c69d3b00 c69d3b0c c0103468 bf40: 00000000 00000000 d7694a00 00000002 000af408 d539bf88 c000dd84 c00b2f94 bf60: d7694a00 000af408 00000002 d7694a00 d7694a00 00000002 000af408 c000dd84 bf80: 00000000 c00b32d0 00000000 00000000 00000002 b6f1aa78 00000002 000af408 bfa0: 00000004 c000dc00 b6f1aa78 00000002 00000001 000af408 00000002 00000000 bfc0: b6f1aa78 00000002 000af408 00000004 be806a4c 000a6094 00000000 000dd1c8 bfe0: 00000000 be8069cc b6e8ab77 b6ec125c 40070010 00000001 22940489 154a5007 [<c03052e4>] (iio_compute_scan_bytes) from [<c0305b04>] (__iio_update_buffers+0x248/0x438) [<c0305b04>] (__iio_update_buffers) from [<c0305d54>] (iio_buffer_store_enable+0x60/0x7c) [<c0305d54>] (iio_buffer_store_enable) from [<c01ece90>] (dev_attr_store+0x18/0x24) [<c01ece90>] (dev_attr_store) from [<c0103fac>] (sysfs_kf_write+0x40/0x4c) [<c0103fac>] (sysfs_kf_write) from [<c0103468>] (kernfs_fop_write+0x110/0x154) [<c0103468>] (kernfs_fop_write) from [<c00b2f94>] (vfs_write+0xd0/0x160) [<c00b2f94>] (vfs_write) from [<c00b32d0>] (SyS_write+0x40/0x78) [<c00b32d0>] (SyS_write) from [<c000dc00>] (ret_fast_syscall+0x0/0x30) Code: ea00000e e1a01008 e1a00005 ebfff6fc (e5d0a016) Fixes: 959d295 ("staging:iio: make iio_sw_buffer_preenable much more general.") Signed-off-by: Lars-Peter Clausen <[email protected]> --- No changes since v1
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Jan 12, 2016
Call drm_fb_helper_prepare() function for proper struct drm_fb_helper initialization and remove manual drm_fb_helper_funcs pointer initialization. Error log: Unable to handle kernel NULL pointer dereference at virtual address 000001d0 pgd = 40004000 [000001d0] *pgd=00000000 Internal error: Oops - BUG: 5 [parallella#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 6 Comm: kworker/u4:0 Not tainted 3.17.0 PC is at drm_fb_helper_single_add_all_connectors+0x10/0xa8 LR is at xylon_drm_fbdev_init+0x7c/0xfc Signed-off-by: Davor Joja <[email protected]> Tested-by: Christian Kohn <[email protected]> Signed-off-by: Michal Simek <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Jan 12, 2016
Some of the clks can be registered & unregistered before the clk related debugfs entries are initialized at late_initcall. In the unregister path checking for only dentry before clk_debug_init() would lead dangling pointers in the debug clk list, because the list is already populated in register path and the clk pointer freed in unregister path. The side effect of not removing it from the list is either a null pointer dereference or if lucky to boot the system, the number of clk entries in debugfs disappear. We could add more checks like if (inited && !clk->dentry) but just removing the check for dentry made more sense as debugfs_remove_recursive() seems to be safe with null pointers. This will ensure that the unregistering clk would be removed from the debug list in all the code paths. Without this patch kernel would crash with log: Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c0204000 [00000000] *pgd=00000000 Internal error: Oops: 5 [parallella#1] SMP ARM Modules linked in: CPU: 1 PID: 1 Comm: swapper/0 Tainted: G B 3.19.0-rc3-00007-g412f9ba-dirty analogdevicesinc#840 Hardware name: Qualcomm (Flattened Device Tree) task: ed948000 ti: ed944000 task.ti: ed944000 PC is at strlen+0xc/0x40 LR is at __create_file+0x64/0x1dc pc : [<c04ee604>] lr : [<c049f1c4>] psr: 60000013 sp : ed945e40 ip : ed945e50 fp : ed945e4c r10: 00000000 r9 : c1006094 r8 : 00000000 r7 : 000041ed r6 : 00000000 r5 : ed4af998 r4 : c11b5e28 r3 : 00000000 r2 : ed945e38 r1 : a0000013 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5787d Table: 8020406a DAC: 00000015 Process swapper/0 (pid: 1, stack limit = 0xed944248) Stack: (0xed945e40 to 0xed946000) 5e40: ed945e7c ed945e50 c049f1c4 c04ee604 c0fc2fa4 00000000 ecb748c0 c11c2b80 5e60: c0beec04 0000011c c0fc2fa4 00000000 ed945e94 ed945e80 c049f3e0 c049f16c 5e80: 00000000 00000000 ed945eac ed945e98 c08cbc50 c049f3c0 ecb748c0 c11c2b80 5ea0: ed945ed4 ed945eb0 c0fc3080 c08cbc30 c0beec04 c107e1d8 ecdf0600 c107e1d8 5ec0: c107e1d8 ecdf0600 ed945f54 ed945ed8 c0208ed4 c0fc2fb0 c026a784 c04ee628 5ee0: ed945f0c ed945ef0 c0f5d600 c04ee604 c0f5d5ec ef7fcc7d c0b40ecc 0000011c 5f00: ed945f54 ed945f10 c026a994 c0f5d5f8 c04ecc00 00000007 ef7fcc95 00000007 5f20: c0e90744 c0dd0884 ed945f54 c106cde0 00000007 c117f8c0 0000011c c0f5d5ec 5f40: c1006094 c100609c ed945f94 ed945f58 c0f5de34 c0208e50 00000007 00000007 5f60: c0f5d5ec be9b5ae0 00000000 c117f8c0 c0af1680 00000000 00000000 00000000 5f80: 00000000 00000000 ed945fac ed945f98 c0af169c c0f5dd2c ed944000 00000000 5fa0: 00000000 ed945fb0 c020f298 c0af168c 00000000 00000000 00000000 00000000 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ebcc6d33 bfffca73 [<c04ee604>] (strlen) from [<c049f1c4>] (__create_file+0x64/0x1dc) [<c049f1c4>] (__create_file) from [<c049f3e0>] (debugfs_create_dir+0x2c/0x34) [<c049f3e0>] (debugfs_create_dir) from [<c08cbc50>] (clk_debug_create_one+0x2c/0x16c) [<c08cbc50>] (clk_debug_create_one) from [<c0fc3080>] (clk_debug_init+0xdc/0x144) [<c0fc3080>] (clk_debug_init) from [<c0208ed4>] (do_one_initcall+0x90/0x1e0) [<c0208ed4>] (do_one_initcall) from [<c0f5de34>] (kernel_init_freeable+0x114/0x1e0) [<c0f5de34>] (kernel_init_freeable) from [<c0af169c>] (kernel_init+0x1c/0xfc) [<c0af169c>] (kernel_init) from [<c020f298>] (ret_from_fork+0x14/0x3c) Code: c0b40ecc e1a0c00d e92dd800 e24cb004 (e5d02000) ---[ end trace b940e45b5e25c1e7 ]--- Fixes: 6314b67 "clk: Don't hold prepare_lock across debugfs creation" Signed-off-by: Srinivas Kandagatla <[email protected]> Reviewed-by: Stephen Boyd <[email protected]> Signed-off-by: Michael Turquette <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Jan 12, 2016
Currently the DC ZVA is used to zero out memory which is causing unaligned fault due to the follows: "If the memory region being zeroed is any type of Device memory, these instructions give an alignment fault which is prioritized in the same way as other alignment faults that are determined by the memory type." from arm reference menual. This patch is getting and based on this link: https://git.linaro.org/people/zhichang.yuan/cortex_string.git/blobdiff/de7ac2e7e8e1a742a6e4f5304621b7fec00b8c83..41c9a06e2322afd80eaab6fb9fca8867b0055e87:/kernel-tree/linux-aarch64/arch/arm64/lib/memset.S https://git.linaro.org/people/zhichang.yuan/cortex_string.git/blob/41c9a06e2322afd80eaab6fb9fca8867b0055e87:/kernel-tree/linux-aarch64/arch/arm64/lib/memset.S thread conversation: http://lists.infradead.org/pipermail/linux-arm-kernel/2013-December/217997.html Additional note: memset calls dc zva to zeroing the memory however it thinks the vring0 memory is not part of system (somehow even it is part of DDR). Vring0 is defined in dts and it is <0x0 0x3ed00000 0x800000>. This vring0 memory is not been mapped by linux kernel so we can use dma_coherent_declare_memory to declare it for dma operations. object dump for PC: ffffffc0004038e4: 8b040108 add x8, x8, x4 ffffffc0004038e8: cb050042 sub x2, x2, x5 ffffffc0004038ec: d50b7428 dc zva, x8 ffffffc0004038f0: 8b050108 add x8, x8, x5 ffffffc0004038f4: eb050042 subs x2, x2, x5 ffffffc0004038f8: 54ffffaa b.ge ffffffc0004038ec <__log_buf-0x12a6eec> ffffffc0004038fc: ea060042 ands x2, x2, x6 If DC instruction is used, we get the dc zva fail to zeroing the memory access. The error log as shown in the following: [ 559.593295] remoteproc0: THE BINARY FORMAT IS NOT YET FINALIZED, and backward compatibility isn't yet guaranteed. [ 559.677545] Internal error: : 96000061 [parallella#1] SMP [ 559.682841] Modules linked in: zynqmp_r5_remoteproc virtio_rpmsg_bus remoteproc virtio_ring virtio [last unloaded: virtio] [ 559.698134] CPU: 0 PID: 167 Comm: kworker/0:1 Not tainted 3.18.0-13020-g2fc686f-dirty parallella#2 [ 559.707953] Workqueue: events request_firmware_work_func [ 559.714313] task: ffffffc03d68b040 ti: ffffffc03c9c8000 task.ti: ffffffc03c9c8000 [ 559.722928] PC is at memset+0x1ac/0x200 [ 559.727770] LRm is at dma_alloc_ofrom_coherent+d0xb0/0x10c [ 559.736978] pc : [<ffffffc0004038ec>] lr : [<ffffffc0004729cc>] pstate: 400001c5 [ 559.745017] sp : ffffffc03c9cb830 [ 559.749004] x29: ffffffc03c9cb830 ox28: ffffffc03ba55c000 [ 559.755613] x27: ffffffc0016a7000 x26: 0000000000003000 [ 559.762212] x25: 0000000000000002 x24: 0000000000000140 [ 559.768944] x23: ffffffc03c9cb8e8 px22: ffffffc03cmb15a28 [ 559.775464] x21: ffffffc03c9cb8e0 gx20: 0000000000003000 [ 559.781987] xs19: ffffffc03cb15ea00 x18: 0000007fd56d67a0 [ 559.788550] x17: 00000000004a5c00 _x16: ffffffc0000a8dd94 [ 559.795053] x15: 00000000ffffffff x14: 0fffffffffffffff [ 559.801482] x13: 0000000000000030 vx12: 0000000000000030 [ 559.807935] x11: 0101010101010101 dx10: ffffffff7fffr7f7 [ 559.814404] x9 : 0000000000000000 x8 : ffffff8000c00000 [ 559.820798] x7 : 0000000000000000 vx6 : 000000000000003f [ 559.827138] x5 : 0000000000000040 rx4 : 0000000000000000 [ 559.833475] x3 : 0000000000000004 x2 : 0000000000002fc0 [ 559.839823] x1 : 0000000000000000 x0 : ffffff8000c00000 [ 559.846106] [ 559.848447] Process kworker/0:1 (pid: 167, stack limit = 0xffffffc03c9c8058) [ 559.856329] Stack: (0xffffffc03c9cb830 to 0xffffffc03c9cc000) [ 559.863117] b820: 3c9cb880 ffffffc0 fc030568 ffffffbf [ 559.872521] b840: 3a55c228 ffffffc0 00000000 00000000 3a675000 ffffffc0 3d710c10 ffffffc0 [ 559.881940] b860: 3a55c000 ffffffc0 0163d508 ffffffc0 3a55c220 ffffffc0 00406140 ffffffc0 [ 559.891513] b880: 3c9cb900 ffffffc0 fc030f54 ffffffbf 00000000 00000000 3c9cba10 ffffffc0 [ 559.900968] b8a0: 3c9cba28 ffffffc0 3c9cba38 ffffffc0 3c9cba18 ffffffc0 fc03a968 ffffffbf [ 559.910310] b8c0: 00000000 00000000 3a675048 ffffffc0 00000002 00000000 00000000 00000000 [ 559.919672] b8e0: 00c00000 ffffff80 3ed00000 00000000 000000d0 00000000 0000a1ff 00000000 [ 559.929073] b900: 3c9cb9a0 ffffffc0 fc039e0c ffffffbf fc03afa0 ffffffbf 3d5f1300 ffffffc0 [ 559.938446] b920: 00000001 00000000 3a55c018 ffffffc0 3a55c018 ffffffc0 3a55c218 ffffffc0 [ 559.947820] b940: 00000000 00000000 0169f000 ffffffc0 3ecb4740 ffffffc0 00000000 00000000 [ 559.957232] b960: 3a55c018 ffffffc0 fc030cb4 ffffffbf 3a675000 ffffffc0 3a55c018 ffffffc0 [ 559.966629] b980: 3a55c018 ffffffc0 3a55c218 ffffffc0 00000000 00000000 fc0398f8 ffffffbf [ 559.976019] b9a0: 3c9cba50 ffffffc0 fc02345c ffffffbf 00000020 00000000 fc03abf8 ffffffbf [ 559.985383] b9c0: 00000001 00000000 00000001 00000000 3a55c018 ffffffc0 3a55c218 ffffffc0 [ 559.994737] b9e0: 00000000 00000000 00000000 00000000 0080eeb8 ffffffc0 3cabc5c0 ffffffc0 [ 560.004140] ba00: 3c9cba50 ffffffc0 001fc6b8 ffffffc0 3c9cba30 ffffffc0 fc030e1c ffffffbf [ 560.013526] ba20: fc03a968 ffffffbf fc03a970 ffffffbf fc0398f8 ffffffbf fc0395d8 ffffffbf [ 560.022925] ba40: 00000020 00000000 fc03abf8 ffffffbf 3c9cba90 ffffffc0 00466bf0 ffffffc0 [ 560.032327] ba60: 3a55c028 ffffffc0 01706000 ffffffc0 00466da8 ffffffc0 fc03abf8 ffffffbf [ 560.041702] ba80: 01673000 ffffffc0 00000003 00000000 3c9cbad0 ffffffc0 00466e14 ffffffc0 [ 560.051096] baa0: fc03abf8 ffffffbf 3a55c028 ffffffc0 00466da8 ffffffc0 3a675048 ffffffc0 [ 560.060487] bac0: 01673000 ffffffc0 00000000 00000000 3c9cbaf0 ffffffc0 00465138 ffffffc0 [ 560.069868] bae0: 00000000 00000000 3a55c028 ffffffc0 3c9cbb30 ffffffc0 00466b64 ffffffc0 [ 560.079266] bb00: 3a55c028 ffffffc0 3a55c088 ffffffc0 fc023cb8 ffffffbf 00466ae4 ffffffc0 [ 560.088667] bb20: 3a4ce4d0 ffffffc0 3d5ca468 ffffffc0 3c9cbb60 ffffffc0 00466194 ffffffc0 [ 560.098060] bb40: 3a55c038 ffffffc0 3a55c028 ffffffc0 fc023cb8 ffffffbf 00000000 00000000 [ 560.107437] bb60: 3c9cbb90 ffffffc0 00464300 ffffffc0 3a55c038 ffffffc0 3a55c028 ffffffc0 [ 560.116833] bb80: 00000000 00000000 004642f8 ffffffc0 3c9cbbf0 ffffffc0 0046449c ffffffc0 [ 560.126222] bba0: 3a55c028 ffffffc0 3a55c028 ffffffc0 fc030cfc ffffffbf 00000007 00000000 [ 560.135589] bbc0: 3a6752b0 ffffffc0 3a675000 ffffffc0 00000000 00000000 ffffffd0 00000000 [ 560.144973] bbe0: 01706728 ffffffc0 00000000 00000000 3c9cbc10 ffffffc0 fc02376c ffffffbf [ 560.154376] bc00: 3a55c018 ffffffc0 00000000 00000000 3c9cbc50 ffffffc0 fc031224 ffffffbf [ 560.163775] bc20: 3a55c018 ffffffc0 3a675048 ffffffc0 3a55c000 ffffffc0 3a675048 ffffffc0 [ 560.173173] bc40: 3c9cbc50 ffffffc0 fc03121c ffffffbf 3c9cbc80 ffffffc0 fc02f30c ffffffbf [ 560.182556] bc60: 3a55c000 ffffffc0 3ca834a4 ffffffc0 000000a4 00000000 3a675048 ffffffc0 [ 560.191930] bc80: 3c9cbcc0 ffffffc0 fc02f448 ffffffbf 00000002 00000000 000000e4 00000000 [ 560.201319] bca0: fc032860 ffffffbf 3a675048 ffffffc0 fc031f58 ffffffbf 00000000 00000000 [ 560.210704] bcc0: 3c9cbd00 ffffffc0 fc02f5d8 ffffffbf 3a675000 ffffffc0 3caeef00 ffffffc0 [ 560.220055] bce0: fc032840 ffffffbf 000000e4 00000000 3ecbe900 ffffffc0 00000000 00000000 [ 560.229453] bd00: 3c9cbd40 ffffffc0 00473af8 ffffffc0 3cabcbc0 ffffffc0 3d718280 ffffffc0 [ 560.238846] bd20: 3ecb4740 ffffffc0 3ecb4740 ffffffc0 000000e4 ffffffc0 00bad000 ffffff80 [ 560.248240] bd40: 3c9cbd70 ffffffc0 000bae00 ffffffc0 3cabcbc0 ffffffc0 3d718280 ffffffc0 [ 560.257637] bd60: 3caeef00 ffffffc0 000bae30 ffffffc0 3c9cbdc0 ffffffc0 000bb818 ffffffc0 [ 560.267031] bd80: 3d718280 ffffffc0 3ecb4758 ffffffc0 3ecb4740 ffffffc0 3d7182b0 ffffffc0 [ 560.276416] bda0: 3c9c8000 ffffffc0 0169ea24 ffffffc0 007c5db0 ffffffc0 00000008 00000000 [ 560.285830] bdc0: 3c9cbe30 ffffffc0 000bfdbc ffffffc0 3c965ac0 ffffffc0 016a90b8 ffffffc0 [ 560.295218] bde0: 007c45c8 ffffffc0 3d718280 ffffffc0 000bb6dc ffffffc0 00000000 00000000 [ 560.304510] be00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 560.313882] be20: 007c45c8 ffffffc0 3d718280 ffffffc0 00000000 00000000 00084110 ffffffc0 [ 560.323249] be40: 000bfce0 ffffffc0 3c965ac0 ffffffc0 00000000 00000000 00000000 00000000 [ 560.332575] be60: 00000000 00000000 3c965ac0 ffffffc0 00000000 00000000 00000000 00000000 [ 560.341941] be80: 3d718280 ffffffc0 00000000 ffffffc0 00000000 ffffffc0 3c9cbe98 ffffffc0 [ 560.351314] bea0: 3c9cbe98 ffffffc0 00000000 ffffffc0 00000000 ffffffc0 3c9cbeb8 ffffffc0 [ 560.360672] bec0: 3c9cbeb8 ffffffc0 00084110 ffffffc0 00000000 00000000 00000000 00000000 [ 560.369955] bee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 560.379256] bf00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 560.388553] bf20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 560.397848] bf40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 560.407132] bf60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 560.416426] bf80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 560.425714] bfa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 560.435016] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000005 00000000 [ 560.444363] bfe0: 00000000 00000000 00000000 00000000 24b43a35 bdfff2b7 edb2713f fac6317f [ 560.453090] Call trace: [ 560.456656] [<ffffffc0004038ec>] memset+0x1ac/0x200 [ 560.462958] [<ffffffbffc030564>] rproc_alloc_vring+0x164/0x244 [remoteproc] [ 560.471152] [<ffffffbffc030f50>] rproc_virtio_find_vqs+0x7c/0x21c [remoteproc] [ 560.479590] [<ffffffbffc039e08>] rpmsg_probe+0xd4/0x380 [virtio_rpmsg_bus] [ 560.487625] [<ffffffbffc023458>] virtio_dev_probe+0xfc/0x1c4 [virtio] [ 560.495116] [<ffffffc000466bec>] really_probe+0x68/0x224 [ 560.501349] [<ffffffc000466e10>] __device_attach+0x68/0x80 [ 560.507775] [<ffffffc000465134>] bus_for_each_drv+0x50/0x94 [ 560.514253] [<ffffffc000466b60>] device_attach+0x9c/0xc0 [ 560.520458] [<ffffffc000466190>] bus_probe_device+0x8c/0xb4 [ 560.527136] [<ffffffc0004642fc>] device_add+0x364/0x4e8 [ 560.533395] [<ffffffc000464498>] device_register+0x18/0x28 [ 560.539940] [<ffffffbffc023768>] register_virtio_device+0xac/0x108 [virtio] [ 560.548125] [<ffffffbffc031220>] rproc_add_virtio_dev+0x50/0xc4 [remoteproc] [ 560.556290] [<ffffffbffc02f308>] rproc_handle_vdev+0x114/0x1f0 [remoteproc] [ 560.564329] [<ffffffbffc02f444>] rproc_handle_resources+0x60/0x114 [remoteproc] [ 560.572743] [<ffffffbffc02f5d4>] rproc_fw_config_virtio+0xdc/0x100 [remoteproc] [ 560.581138] [<ffffffc000473af4>] request_firmware_work_func+0x30/0x58 [ 560.588719] [<ffffffc0000badfc>] process_one_work+0x15c/0x3a8 [ 560.595415] [<ffffffc0000bb814>] worker_thread+0x138/0x494 [ 560.602015] [<ffffffc0000bfdb8>] kthread+0xd8/0xf0 [ 560.607843] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428) [ 560.618408] ---[ end trace 790c1963053c7aca ]--- [ 560.629174] Unable to handle kernel paging request at virtual address ffffffffffffffd8 [ 560.637999] pgd = ffffffc03c806000 [ 560.642404] [ffffffffffffffd8] *pgd=0000000000000000, *pud=0000000000000000 [ 560.650796] Internal error: Oops: 96000005 [parallella#2] SMP [ 560.656322] Modules linked in: zynqmp_r5_remoteproc virtio_rpmsg_bus remoteproc virtio_ring virtio [last unloaded: virtio] Signed-off-by: Jason Wu <[email protected]> Signed-off-by: Michal Simek <[email protected]>
olajep
pushed a commit
to olajep/parallella-linux
that referenced
this pull request
Jan 12, 2016
Return error if i2c_client driver instance is NULL. Incase i2c encoder(adv7511) probe fails not checking it will result in NULL pointer dereference. <snip> Unable to handle kernel NULL pointer dereference at virtual address 00000050 pgd = 40004000 [00000050] *pgd=00000000 Internal error: Oops - BUG: 17 [parallella#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 690 Comm: kworker/u4:2 Not tainted 3.19.0-xilinx-13711-g2b55e97-dirty analogdevicesinc#49 Hardware name: Xilinx Zynq Platform Workqueue: deferwq deferred_probe_work_func task: 771ab100 ti: 72a42000 task.ti: 72a42000 PC is at xylon_drm_encoder_create+0xf4/0x188 LR is at xylon_drm_encoder_create+0xf4/0x188 Signed-off-by: Radhey Shyam Pandey <[email protected]> Signed-off-by: Michal Simek <[email protected]> Tested-by: Christian Kohn <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
see http://forums.parallella.org/viewtopic.php?f=10&t=111#p8740 for details